Comment on Congress Wants Tech Companies to Pay Up for AI Training Data

Motavader@lemmy.world 10 months ago

Thanks for the link to Common Crawl; I didn’t know about that project but it looks interesting.

That’s also an interesting point about heavily curated data sets. Could something like that help overcome some of the bias in current models? For example, if you were training a facial recognition model, you could use a curated, open source dataset with representative samples of all races and genders to try to reduce racial bias. Anyone training a facial recognition model for any purpose would then have a training set that can be peer reviewed for accuracy.
