you are right, i don’t know how LLMs are trained, but ironically, this is a perfect example of a minority being privileged by a system.
an important assumption you have to consider: in your example, how did the AI know what race people are in the first place? it seems like a small consideration, but it’s wildly significant.
the modern understanding of race was not present throughout all of history, and only arose in the 17th century. without getting into the weeds, the fact that your fictional AI can distinguish between whiteness and non-whiteness already means it was designed by someone who understands those structures, i.e. someone who has a modern understanding of race. a perfectly well-meaning and anti-racist designer would go to great lengths to keep the AI from recognizing race at all, both directly, by sanitizing the training data to remove race from the inputs, and indirectly, by noting which other data correlate with race (such as sports, in this article) and controlling for those correlations.
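to make the "directly" and "indirectly" parts concrete, here is a minimal sketch in plain python. everything in it is an assumption for illustration: the column names, the toy rows, the 0/1 encoding of the protected attribute, and the 0.5 threshold are all made up. dropping correlated columns is also only one crude way of "controlling for" proxies, and it will miss non-linear ones.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)

def sanitize(rows, protected, threshold=0.5):
    """Drop the protected column plus any column whose |correlation| with it is >= threshold.

    `threshold` is an arbitrary illustrative cutoff, not a recommended value.
    """
    protected_vals = [r[protected] for r in rows]
    proxies = sorted(
        col for col in rows[0]
        if col != protected
        and abs(pearson([r[col] for r in rows], protected_vals)) >= threshold
    )
    dropped = {protected, *proxies}
    cleaned = [{k: v for k, v in r.items() if k not in dropped} for r in rows]
    return cleaned, proxies

# Hypothetical toy data: "group" stands in for the protected attribute, encoded 0/1.
rows = [
    {"group": 0, "plays_sport_x": 0, "years_experience": 5},
    {"group": 0, "plays_sport_x": 0, "years_experience": 2},
    {"group": 1, "plays_sport_x": 1, "years_experience": 5},
    {"group": 1, "plays_sport_x": 1, "years_experience": 2},
]
cleaned, proxies = sanitize(rows, "group")
print(proxies)   # columns flagged as proxies for the protected attribute
print(cleaned)   # rows with the protected column and its proxies removed
```

in this toy data the sport column tracks the group perfectly, so it gets flagged and removed along with the group column, while experience is uncorrelated with the group and survives.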
BluesF@lemmy.world 8 months ago
The bias is really introduced at the design stage. Designers should be aware of demographic differences and incorporate that into the model to produce something more balanced. It’s far from impossible to design models that do not become biased in this way - although I’m not saying it’s easy.
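One simple way a designer might account for demographic imbalance is to reweight training examples so that each group contributes equally. This is only a sketch of that one technique under made-up assumptions: the group labels `"a"` and `"b"` are hypothetical, and whether reweighting is enough depends heavily on the model and the data.

```python
from collections import Counter

def balanced_weights(groups):
    """Give each sample a weight of n / (k * count_of_its_group).

    n is the number of samples and k the number of distinct groups, so the
    weights within every group sum to the same total (n / k).
    """
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]

# Hypothetical group labels: group "a" is over-represented 3 to 1.
groups = ["a", "a", "a", "b"]
weights = balanced_weights(groups)
print(weights)
```

The weights would then be passed to whatever training routine accepts per-sample weights, so the under-represented group is not drowned out by the majority.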