The biggest crime against shared knowledge ever committed is photobucket fucking off with the pictures in every “how to fix this car problem” forum post.
Online Content Is Disappearing
Submitted 5 months ago by funn@lemy.lol to technology@lemmy.world
https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/
Comments
Jode@midwest.social 5 months ago
PseudorandomNoise@lemmy.world 5 months ago
There’s some old Reddit posts like this too. Advice threads where the person who posted a solution went back and overwrote their comments during the boycott last year. I know why they did it but we still lost some information in the grand scheme of things.
infeeeee@lemm.ee 5 months ago
Most of reddit was already archived before: the-eye.eu/redarcs/
Appoxo@lemmy.dbzer0.com 5 months ago
And that is why I criticized the decisions every time I read about it. Every time I got mixed responses but ultimately got a higher downvote ratio.
Also a reason I participate(d) in the archive warrior reddit project.
Zoidsberg@lemmy.ca 5 months ago
And all the “Thanks! Took two minutes to fix after seeing your post” comments just to rub it in.
aniki@lemm.ee 5 months ago
The internet is dying. Everyone knows it.
Hackworth@lemmy.world 5 months ago
The Internet is dead. Long live the Internet!
I’ll have my AI agents talk to your AI agents.
Boozilla@lemmy.world 5 months ago
The Open Web is definitely dying. Some dystopian weaponized ads hellscape of an apps-required shiternet will be around for a while.
rottingleaf@lemmy.zip 5 months ago
That’s an exaggeration. We had nice things back then with forums and ICQ\AIM\others, which we don’t have now, but the tech allows us to have them. It’s the society that has degraded.
jaybone@lemmy.world 5 months ago
The technology is working against it too. App search engines are just spam ads now and will never find that niche forum that has what you are looking for, like they once did 20 years ago.
Anon518@sh.itjust.works 5 months ago
Forums are still around. People just got lazy and started using reddit instead. Search engines are also to blame since they don’t bring up smaller forums in search results. People can go back to forums if they want.
possiblylinux127@lemmy.zip 5 months ago
This is why we need the internet archive
lvxferre@mander.xyz 5 months ago
Yes. And wikis, too.
We (people in general) have a tendency to share stuff in forums, like Lemmy. That’s fine in the short term, but in the long term this stuff should be sorted, organised, and preferably mirrored. Wikis are perfect for that.
LainTrain@lemmy.dbzer0.com 5 months ago
This is why Discord is poison to our shared pool of knowledge, it’s such a black hole for many games and software (especially ironically enough open source projects) in lieu of decent docs.
Telodzrum@lemmy.world 5 months ago
Wikis are not really a defense against this issue, they are by nature a secondary or (occasionally by policy) a tertiary source of information. Once the source they are recording dies, so does the value of that page on the wiki. From the OP:
54% of Wikipedia pages contain at least one link in their “References” section that points to a page that no longer exists.
blazeknave@lemmy.world 5 months ago
You should see it in person. Just drove by it today. Support them!
KISSmyOSFeddit@lemmy.world 5 months ago
In most cases, this is because an individual page was deleted or removed on an otherwise functional website.
How is this news? I bet a lot of pages were also added in the same time frame, very likely orders of magnitude more.
credo@lemmy.world 5 months ago
I’ve heard the early Internet age referred to as the future dark ages. When all the work, information and content is digitized, it’s prone to being lost to history forever.
rottingleaf@lemmy.zip 5 months ago
Early Internet - yes, but then there’s the middle Internet (or the high Internet if you like, like high Middle Ages) which was in large part scraped by archive.org, and also people generally still knew about offline backups in both eras, and then there’s the late Internet, which moved to siloed services and at the same time most people who used it were and are oblivious about preserving data elsewhere. That’s the worst one.
deweydecibel@lemmy.world 5 months ago
My partner works in historical archiving for science and medicine. Museum work, basically. He’s told me so much of the archives are donated collections of notes, letters, journals, and so on from important doctors, researchers, scientists, etc. Donated by the subject themselves in their later years or by their families.
He’s told me there is a growing issue with those people starting to donate entirely digital collections, but even worse than that, are all the documents that are not being stored on a physical hard drive, but on web services and clouds. By the time these people are willing to start donating their things, so much of it has just been deleted forever without them realizing it. Or worse, they die, and their families no longer have access.
Working in IT, I told him about Microsoft’s growing push to eliminate Outlook and PST files, make it all web based email, and he wasn’t surprised, but he was still bummed to hear it. Apparently a not insignificant amount of those donations are locally stored emails.
MysticKetchup@lemmy.world 5 months ago
Because those pages had information that wasn’t on the new pages?
Just from my own experience, WotC migrated the Magic the Gathering site to a new one, and while some articles were brought over there were a whole lot of stories, strategies and event coverage that were lost or are only available thanks to Archive.org
spamfajitas@lemmy.world 5 months ago
I ran across software once that wouldn’t compile properly and the only documentation available was an archive.org hosted backup of an Intel help page that no longer exists. There is no alternative, Intel just removed it entirely.
Lifter@discuss.tchncs.de 5 months ago
Yes. The whole post is a trick with statistics. Web pages have a limited lifespan. You can do the same trick with human life spans.
“50 % of humans that lived 60 years ago are now dead”. You would tweak the numbers to be factual but something like that makes sense to me.
If you only keep the samples you started out with, of course it’s going to decline over time. The data is guaranteed to not grow since nothing is ever added.
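A minimal sketch of that point, with made-up numbers (the 5% yearly removal rate and the 1,000 new pages per year are assumptions for illustration, not figures from the Pew report). It tracks a fixed starting sample and the total number of live pages separately:

```python
def simulate(years, start_pages=10_000, yearly_loss=0.05, new_per_year=1_000):
    """Return one (surviving share of original sample, total live pages) pair per year.

    All default values are hypothetical and only serve the illustration.
    """
    cohort = float(start_pages)  # the fixed sample picked in year 0
    newer = 0.0                  # pages created after the sample was taken
    history = []
    for _ in range(years):
        cohort *= 1 - yearly_loss                           # the sample only loses pages
        newer = newer * (1 - yearly_loss) + new_per_year    # newer pages also die, but get replenished
        history.append((cohort / start_pages, cohort + newer))
    return history


for year, (share, total) in enumerate(simulate(10), start=1):
    print(f"year {year}: {share:.0%} of original sample alive, {total:,.0f} pages live")
```

Under these assumptions the sample's surviving share can only fall, even while the total page count rises. It says nothing about whether what was lost got replaced by anything equivalent.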
rottingleaf@lemmy.zip 5 months ago
I bet a lot of pages were also added in the same time frame, very likely orders of magnitude more.
No. What you’d make a page for in the 00s, you’d create a FB group or something in the 10s. Hostage to corps and probably too removed for whatever reason.
Appoxo@lemmy.dbzer0.com 5 months ago
And not indexed due to crawling bot-decisions
eskimofry@lemmy.world 5 months ago
Sure a lot of pages of ad infested ads replaced human produced content.
sugar_in_your_tea@sh.itjust.works 5 months ago
And those added pages were probably just as worthless as the ones they replaced.
zooi@feddit.nl 5 months ago
Donate to the internet archive!
avidamoeba@lemmy.ca 5 months ago
This is important. I signed up a week ago.
anticurrent@sh.itjust.works 5 months ago
This content has been moving from the freely accessible internet into the walled gardens of social media. We did it ourselves. Blogs and forums disappeared; copycat farms and SEO made maintaining a blog or a community forum a waste of time; everyone is just tiktoking and looking to monetise every bit of content they put on the internet.
K1nsey6@lemmy.world 5 months ago
This allows the ruling class to write history as they see fit.
LibertyLizard@slrpnk.net 5 months ago
I’ve often wondered what the implications of the internet will be for future historians. On the one hand, there is now an enormous body of writings from not just the educated elite as in the past but from all sorts of ordinary people, which is something that has never really existed before.
On the other hand, how and for how long will these writings be retained? If we stop writing things on paper, will these digital writings become completely inaccessible at some point?
EldritchFeminity@lemmy.blahaj.zone 5 months ago
Already a lot of stuff is becoming one harddrive failure away from being lost forever. Companies don’t care about preserving content, so it’s largely up to random people happening to have saved a copy of something for it to still exist at all.
belit_deg@lemmy.world 5 months ago
And National Libraries and similar institutions around the world, for example www.nb.no/en/digital-preservation/
schnurrito@discuss.tchncs.de 5 months ago
Freely licensed works will be preserved a lot better because there will be more copies of them.
Likewise the fediverse is a step in that direction: this message will be federated to hundreds of servers so is more likely to survive longer than if I posted it to reddit.
skillissuer@discuss.tchncs.de 5 months ago
how long? until next sufficiently large solar flare
ricdeh@lemmy.world 5 months ago
There are so many ways to adequately protect digital information from solar flares. That would be the least of our problems, the actually dangerous part of geomagnetic storms is the severe power outages and the severance of the electrical grid.
mbirth@lemmy.mbirth.uk 5 months ago
I believe it’s often because nobody does their own website anymore but instead uses managed services, e.g. Medium. Or bits of information, that would’ve been worth a blog post some while ago, end up on sites like StackOverflow, Reddit, etc… And once these services want to monetise these contents, they usually start with limiting public access.
And OTOH TikTok, Instagram Reels and YouTube Shorts are doing everything they can to further limit people’s attention spans and get them addicted to those services. So the people capable of and/or interested in producing proper “content” are dwindling, too.
N01R3@lemmynsfw.com 5 months ago
I remember a small RPG maker game that I no longer can find on the web, let alone anything that used to be hosted on FreewareFiles or Raymond.cc…
moon@lemmy.cafe 5 months ago
I used to be on the rpg maker forums back in the day, can you loosely describe it to me?
N01R3@lemmynsfw.com 5 months ago
I can give you its name too: End of the World, Part 1 and End of the World Part 2. It was basically a Final Fantasy clone/attempt that I thought when I was younger was pretty good. Can’t remember much about what made it unique though aside from a hidden stick figure fight right outside the castle.
randompasta@lemmy.today 5 months ago
The content isn’t significantly disappearing. It is being consolidated and monetized.
forrgott@lemm.ee 5 months ago
But… they’ve long since figured out how to monetize without the content. So, that’s a hard disagree from me…
schmorpel@slrpnk.net 5 months ago
Yeah, just like most material that was ever printed or carved into a clay tablet. It’s the way of things.
Zedstrian@lemmy.dbzer0.com 5 months ago
The difference is that most of that content lasted for at least a few decades, if not centuries before being lost to time. As content on the internet is ‘destroyed’ if no one hosts it any more, a lot of valuable content is being lost in just a few years after being created. Archiving needs to be more widespread and better supported if the resources and culture of the internet as it has evolved over time are to be preserved for posterity.
Fisch@discuss.tchncs.de 5 months ago
Some government should finally grow the balls to reform copyright, it’s insane that basically the whole world uses this broken system that, among other things, makes archiving illegal
ricdeh@lemmy.world 5 months ago
The thing is, we can do better, it is not a technological problem as during the analogue/paper age with chemical degradation, it is a societal and legal issue.
schmorpel@slrpnk.net 5 months ago
It’s a technological and a physical issue. We just can’t store every bit of information plus a picture of everyone’s cat. We can’t guarantee that no information ever gets lost. We’ve also not really stored and archived every shopping list, advertising, pamphlet, silly poem, ugly drawing etc. since the time of the printing press and that’s okay.
It might be a good idea to store and archive some written material as time passes but we want to be a bit picky about what we store. That said, I wouldn’t mind to find more shopping lists and less posh documents in museums.
bilb@lem.monster 5 months ago
There is no practical reason to “do better.” It’s fine.
ColeSloth@discuss.tchncs.de 5 months ago
If Ian’s shoelace site disappears, I’ma bounce too.
sbv@sh.itjust.works 5 months ago
That’s pretty interesting. It looks like they define inaccessible links as urls that get a 404 or the server doesn’t resolve.
I wonder if there are any real implications of this. We seem to know it and work around it in some cases, e.g. StackOverflow saying answers need to contain quotes from pages they reference.
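For what it's worth, that definition is easy to sketch. This is only my reading of it (404 or a host name that doesn't resolve), not the report's actual tooling; `classify` and `check` are made-up names, and it only uses Python's standard library:

```python
import socket
import urllib.error
import urllib.request


def classify(status, resolved):
    """Label a link using the definition as read above (hypothetical helper)."""
    if not resolved:
        return "inaccessible"  # host name did not resolve
    if status == 404:
        return "inaccessible"  # server answered, but the page is gone
    return "accessible"        # anything else, even a 500, counts as reachable here


def check(url, timeout=10):
    """Fetch url and classify it; needs network access."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.status, True)
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its parent class.
        return classify(err.code, True)
    except urllib.error.URLError as err:
        if isinstance(err.reason, socket.gaierror):
            return classify(None, False)  # DNS lookup failed
        raise  # timeouts, TLS errors etc. are a different failure; not guessed here
```

Note how narrow that is: a page that still returns 200 but has been replaced by a parked-domain ad or a login wall would count as accessible.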
EldritchFeminity@lemmy.blahaj.zone 5 months ago
For some real-world examples of this issue, you can look at how the only reason we have any of the early BBC news reels and TV shows is because of copies recorded by people on their TVs. The BBC reused the tapes that they recorded on for new programming to save money on buying tapes. When they started to think about the preservation of news and shows like Dr. Who, they had to turn to the general public and ask them to donate any recordings that they might have made.
It’s estimated that more than 50% of all video games are lost forever because companies didn’t care to save a master copy, and this has already come back to bite some of these companies in the ass with the recent trend of remakes and remasters. There was a recent remake of one of the GTA games from the early 2000s that was very poorly received, and it turned out that the company who worked on it only had the mobile phone port of the game to work with because Rockstar hadn’t bothered to keep a master copy of the game. There was another recent remake of a game that was very obviously done using a pirated copy of the game as the source, because they hadn’t even bothered to remove the cracker’s logo from the game.
With examples like that and Sony recently removing thousands of people’s access to music and movies that they bought on basically a whim, it’s pretty clear that preservation efforts will be done in spite of companies rather than helped by them. And so that means copies of things will be one random harddrive failure of some single person on the internet away from disappearing forever.
ZeffSyde@lemmy.world 5 months ago
Oh, thank fuck. David Bowie’s Area is still online.
morrowind@lemmy.ml 5 months ago
Certain types of tweets tend to go away more often than others. More than 40% of tweets written in Turkish or Arabic are no longer visible on the site within three months of being posted.
I’ve read this is a major problem in Facebook as well, they lack good moderation for these languages and especially the Arabic script and so just remove things heavy handedly to be safe.
errer@lemmy.world 5 months ago
God bless archive.org. Fuck the turds trying to bring it down.
possiblylinux127@lemmy.zip 5 months ago
Donate and contact your government rep
umbrella@lemmy.ml 5 months ago
run the archivewarrior to help them out, donate or pressure your government to stop it from being killed.
Crackhappy@lemmy.world 5 months ago
warrior.archiveteam.org