Submitted 2 days ago by m3t00@lemmy.world to technology@lemmy.world
https://lemmy.world/post/43452088
saw this somewhere else too. ddos stuff. this one blames ru for archive.today mess. sounds about right.
The root of the problem is Wikipedia not having local snapshots leaves their articles vulnerable to eroding sources.
Is it reasonable for them to keep their own local snapshots?
That’s not a trivial amount of work and data, particularly if it’s multimedia.
I think it’s a concerning issue affecting long-term viability of the platform. It’ll only get worse as time goes on and sources go offline.
Okay so, what is the currently going-for alternative that bypasses paywalls?
i’ve had consistently good luck with the archive.org wayback machine
copy the headline and find the same thing free somewhere else. usually it’s a news site full of unreadable slop. pay walls used to be almost worth bypassing. no more. just another money grab, pretending to protect valuable information. not
I’m afraid there aren’t any. You can use the Bypass Paywalls Clean extension though
Oh well, archive.today it is in the meantime I guess.
As someone who uses Bypass Paywalls Clean, this is so frustrating.
Bypass Paywalls Clean was chased off of the Firefox Add-Ons site, chased off of Gitlab, and chased off of Github via DMCA takedown notices for copyright infringement. It is now hosted on the Russian Gitflic.ru.
We all know Russia sucks in a litany of ways, but one way it doesn’t suck is that it is one of the few countries left that has really thrown all caution to the wind and absolutely said “fuck it” in terms of respecting the international Copyright norms as promoted by and deeply influenced by the USA copyright cabal.
We have spent the better part of two decades dealing with the DMCA being used as an outright weapon to silence information that corporations and government find inconvenient mostly because that information is wildly incriminating for them.
Websites like Anna’s Archive, Libgen, and Sci-Hub live because they use hosting in countries that allow them to bypass these kinds of restrictions. Russia is one of the most common countries for them to host the data out of due to the lack of enforcement of copyright laws, although it is obviously not the only country that these sites use.
Until we are able to alter international copyright protections to be reasonable instead of overzealous and aggressively abusive, we will all have to risk hosting such sites in countries that are otherwise very unsavory to be associated with.
And now Firefox completely bans it from even being sideloaded.
I’m with you on this, but let’s be careful here.
We all know Russia sucks in a litany of ways, but one way it doesn’t suck is that it is one of the few countries left that has really thrown all caution to the wind and absolutely said “fuck it” in terms of respecting the international Big Copyright norms as promoted by and deeply influenced by the USA copyright cabal (RIAA/MPAA).
I once made a YouTube video which somehow included a clip from some RT Russian TV bullshit show. (The show was in fact a direct ripoff of Gordon Ramsay’s Hell’s Kitchen, which I’m sure they did not get a license for.)
Some fucking Russian troll bots then DMCA’d my YouTube video, for using their clip, even though it was clearly “fair use” in US jurisdiction, and YouTube happily sucked their russian dicks and flagged and removed my video.
And my video had probably 15 views, like it wasn’t a big thing.
So they aren’t exactly the Robin Hood of free speech.
Of course they aren’t, they will happily block information that they dislike because it’s embarrassing and incriminating to them. Skepticism should cut both ways: toward those who use a Russian connection to delegitimize valuable tools and the people associated with them, and toward why Russia allows those things to persist provided they impact Western countries but not Russia.
Not sure how this says anything about Russian copyright laws or Russian government.
hey thanks, i had never heard of that bypass paywalls firefox addon
There’s also a version for Chrome if you swing that way.
Ironically, when Russia was joining the World Trade Organization in early 2010s, one requirement was for them to block pirate sites, namely torrent-sharing ones. Which they did, and the sites are blocked to this day.
I don’t think the issue is paywalls. I think the issue is the personal actions of the owner. I also really don’t think Russia plays into this. Again, the personal actions of the owner of archive[.]today were the reason it was removed. The site was used by the owner to personally attack someone.
Is your comment in the thread about Wikipedia banning archive.today?
Original post title was:
Until further notice: archive.today/archive.is/archive.ph/… is banned from this community for apparently being a Russian DDOS tool linking to the /c/ukraine community which posted it.
Also, from the Ars story:
Patokallio wasn’t able to determine who runs Archive.today but mentioned apparent aliases such as “Denis Petrov” and “Masha Rabinovich,” and described evidence that the site is operated by someone from Russia.
The reason it matters:
It makes people suspicious of anything hosted in Russia, which is frustrating because there’s a lot of valuable shit hosted there by people who are not necessarily from there, such as Alexandra Elbakyan, who has had many accusations tossed her way due to her website’s association with Russia:
In December 2019, The Washington Post reported that Elbakyan was under investigation by the US Justice Department for suspected ties to Russia’s military intelligence arm, the GRU, to steal U.S. military secrets from defense contractors. Elbakyan has denied this, saying that Sci-Hub “is not in any way directly affiliated with Russian or some other country’s intelligence,” but noting that “of course, there could be some indirect help. The same as with donations, anyone can send them; they are completely anonymous, so I do not know who exactly is donating to Sci-Hub. There could be some help that I’m simply unaware of. I can only add that I write all of Sci-Hub code and design myself and I’m doing the server’s configuration.”
This is understandable, but at the same time, none of the anti-paywall lists are as good as archive.today. They actually have paid accounts at a bunch of paywalled sites, and use them when scraping.
Unfortunately, they’ve allegedly modified the contents of some archived articles, so even though they may do better to archive, nothing archived is of any value because it cannot be trusted.
So are they removing all other websites that post lies or modify their articles to suit their narrative at times?
Fox news? MSN? CNN? BBC? Reuters? AP?
Why the sudden urge to validate the archives? How many articles have been proven to be modified?
Seems like they’ve been wanting to remove an entity the empire doesn’t control and they’re using this as a cover to do it.
What if somebody used archive.today to bypass a paywall and then archived that using Web Archive? (So we’re sure the content stays the same)
For anyone curious, I looked into the DDOSing. A simple snippet of JavaScript was added to archive[.]today that made a background request to the blog with a randomly generated search parameter. Every time someone looked at an archive, they unknowingly sent a request to the blog under attack.
Good reminder to donate to web.archive.org
While archive.org is good and more trustworthy than archive.is, it isn’t as useful for bypassing paywalls.
But Wikipedia doesn’t need to bypass paywalls, and you can bypass them yourself with a bit of work.
I do hope this move results in more support for the IA/Wayback Machine and helps them to update some of their crawler tech — thanks to the rise of AI, some sites are effectively (thru captchas etc.) or actively (through straight-up greed [coughRedditcough]) blocked from being archived almost entirely, which is frustrating for legit archivists/contributors.
Good reminder to pay for independent journalism.
The Guardian, Le Monde, El País, Tageszeitung and many others need subscribers to stay independent of oligarchs.
Also remember the journalists that need support the most are local papers and news stations. The big ones have plenty of donors, and while it’s worth the support, they are less likely to completely collapse than the news that is run in your city.
Go look for that independent source. They will report more news that actually affects you as well.
guardian is surviving by slowly becoming a tabloid. not sure if i would have paid for it anyway, and im not sure if this was preventable by paying for it in the first place.
yeah and they’re also transphobic af as a policy. don’t give them a damn cent
buzzfeed.com/…/guardian-staff-trans-rights-letter
can also find more stuff by just looking up “the guardian transphobia”
I appreciate the guardian a lot more than I did before now that someone gave me a nytimes subscription, seeing how bad they are now. For the guardian’s faults, they do break some stories still, and somewhat comprehensively cover the news, perhaps better than the times, that is too busy trying to cover for Israel to even report honestly on epstein and apparently surrendered to the administration besides.
Paying for journalism simply reinforces the idea that those who don’t pay for it don’t get it, i.e. more paywalls, not fewer.
So everybody needs to stop paying for journalism so the journalists stop travelling, eating and having to pay their rent so journalism eventually becomes free for everybody?
If there is no money to pay for the work it takes, journalism will simply cease to exist.
Paying for journalism is ideal, but unfortunately makes it difficult to cite/link to a source the way Wikipedia needs as a way to ensure the information remains open and accessible.
Admittedly, I’m not familiar with these outlets enough to know if those paywalls are significant, but the problem with direct article links is that those links can change. Archival services (I suppose not archive[.]is) are important for ensuring those articles remain accessible in the format they were presented in.
I’ve come across a number of older Wikipedia articles about more minor or obscure events where links lead to local news outlet websites that no longer exist or were consumed by larger media outlets and as a result no longer provide an appropriate citation.
If this is not an announcement, Lemmy lets you edit your post titles so you can correct that mistake instead of luring in people who think lemmy.world is also banning links using archive.today.
I’m not speculating on your intent, only that you can correct this situation instead of apologizing after the fact.
Everyone seems to be ignoring the fact that he only did this in response to a malicious dox attempt.
He only modified archived pages in response to a dox attempt?
And the thing is, the discovery of the modified pages revealed that it wasn’t even the first time he’d modified pages. And he used a real person’s identity to try and shift blame.
Irrespective of the doxxing allegations, if he’s done all this multiple times already, it means the page archives can’t be trusted AND there’s no guarantee that anything archived with the service will be available tomorrow.
Seems like we need to switch to URLs that contain the SHA256 of the page they’re linking to, so we can tell if anything has changed since the link was created.
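A rough sketch of what that could look like, in Python. The `#sha256=` fragment format and the function names `pin_url` and `verify_pinned` are made up for illustration, not any existing standard, and it assumes you already have the page bytes in hand:

```python
import hashlib
import hmac
from urllib.parse import urlsplit, urlunsplit

PREFIX = "sha256="  # hypothetical fragment convention, not a standard


def pin_url(url: str, content: bytes) -> str:
    """Return the URL with the SHA-256 of the content stored in the fragment."""
    digest = hashlib.sha256(content).hexdigest()
    parts = urlsplit(url)
    return urlunsplit(parts._replace(fragment=PREFIX + digest))


def verify_pinned(pinned_url: str, content: bytes) -> bool:
    """Check that the content still matches the digest recorded in the URL."""
    fragment = urlsplit(pinned_url).fragment
    if not fragment.startswith(PREFIX):
        return False  # link was never pinned
    expected = fragment[len(PREFIX):]
    actual = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(expected, actual)
```

Usage would be something like `link = pin_url("https://example.org/article", page_bytes)` when the citation is created, then `verify_pinned(link, fetched_bytes)` whenever someone follows it. One catch: a byte-exact hash breaks on any change at all (ads, timestamps, markup tweaks), so in practice you would probably want to hash a normalized copy of the article text rather than the raw page.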
It wasn’t a dox attempt though. The blog just collected information that was already publicly available on other sites.
As they should since it doesn’t matter.
Yeah, someone being shitty to you doesn’t mean you go full-fledged shitty in return, it kind of proves your lack of trustworthiness to begin with. It’s like Nazis being like “leftists were mean to me by explaining how my politics made me a Nazi, so I’m gonna show them by Nazi-ing even harder!” It kind of betrays the argument that the reason you got that way was because leftists were mean to you.
Unfortunately, they shot themselves in the foot by responding the way they did. They basically did the job of anyone who wants them taken down and not trusted. It was probably the worst way they could have reacted. Such a tragedy to lose such a valuable website.
Yeah, ESH. His response of editing an archive showed the site to be unreliable as an archive. DDOSing from the site as a counter to the dox attempt caused the site serious reputational harm as well.
It sucks because his site was actually more reliable than The Internet Archive.
lemmy.world/c/ukraine was where i saw this. i didn’t write it. fyi
I’ve switched to .md when the community mentioned something was up with the .today domain. Hopefully that one isn’t compromised.
Bro any archiving/scraping tool can be used for ddos u just tell it to archive the same site over and over and now u have a different IP spamming the endpoint
In this case, their CAPTCHA page intentionally included code to DDoS a particular blog.
Any good archiver will check for an archived copy before making a request, and batch requests. This was very different than the attack you’re imagining — if you opened any archive.today page, it would poll a developer’s personal blog, regardless of whether you were interacting with content from that blog.
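To illustrate the "check for an archived copy first, and batch" part, here is a minimal Python sketch. The `Archiver` class, its in-memory store, and the injected `fetch` callable are all hypothetical; this is not how archive.today or any real archiver is known to be built:

```python
from typing import Callable, Dict, Iterable


class Archiver:
    """Toy archiver that only hits the origin site when no snapshot exists."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                  # hypothetical download function
        self._store: Dict[str, bytes] = {}   # url -> archived snapshot
        self.fetch_count = 0                 # requests actually sent to origins

    def archive(self, url: str) -> bytes:
        # Serve the existing snapshot instead of re-requesting the page.
        if url in self._store:
            return self._store[url]
        self.fetch_count += 1
        snapshot = self._fetch(url)
        self._store[url] = snapshot
        return snapshot

    def archive_batch(self, urls: Iterable[str]) -> Dict[str, bytes]:
        # Deduplicate the batch so each URL is requested at most once.
        unique = dict.fromkeys(urls)
        return {url: self.archive(url) for url in unique}
```

With this design, asking it to archive the same URL a thousand times results in one request to the origin, which is why "just tell it to archive the same site over and over" doesn't work as an attack against a well-behaved archiver.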
don’t know all the details. fyi basically. i forget where i saw the same site mentioned for the same thing. don’t call me bro Bro
How does the paywall circumvention of archive.today work?
It identifies itself as a google (or other) crawler, which sites often allow and give the full content to, for better SEO.
I guess that they genuinely owned subscriptions for popular paywalled sites.
Formfiller@lemmy.world 3 hours ago
That’s very 1984 of them