I’m not “trying to be nice to minority languages”, I’m directly pushing back against the chauvinistic idea that the English Wikipedia is so important that those without it are somehow inferior. There is no “doom spiral”.
I think you missed the problem described here.
The “doom spiral” is not caused by the English Wiki; the English Wiki has nothing to do with it.
The problem described is that people who don’t know a “niche” language try to contribute to a niche Wiki by using machine translation/LLMs.
As per the article:
Virtually every single article had been published by people who did not actually speak the language. Wehr, who now teaches Greenlandic in Denmark, speculates that perhaps only one or two Greenlanders had ever contributed. But what worried him most was something else: Over time, he had noticed that a growing number of articles appeared to be copy-pasted into Wikipedia by people using machine translators. They were riddled with elementary mistakes—from grammatical blunders to meaningless words to more significant inaccuracies, like an entry that claimed Canada had only 41 inhabitants. Other pages sometimes contained random strings of letters spat out by machines that were unable to find suitable Greenlandic words to express themselves.
Now, another problem is Model Collapse (or, well, a similar phenomenon strictly in terms of language itself).
We now have a bunch of “niche” languages’ Wikis containing such errors… that are being used to train machine translators and LLMs to handle these languages. This is contaminating their input data with errors and hallucinations, but since this is the training data, these LLMs consider everything in there as the truth, propagating the errors/hallucinations forward.
I honestly have no clue where you’re getting anything chauvinistic here.
HereIAm@lemmy.world 1 day ago
No one is saying that those who can’t access or read the English Wikipedia are inferior. The issue here is when what is on a non-English Wikipedia article is misleading or flat-out harmful (like what the article says about growing crops) because of juvenile attempts at machine translation getting it very wrong. So what Greenland did was shut down its poorly translated and maintained wiki site instead of letting it fester with misinformation. And this issue compounds when LLMs scrape Wikipedia as a source to learn new languages.