Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates

mm_maybe@sh.itjust.works 2 months ago

Yeah, I’ve struggled with that myself, since my first AI detection model was technically trained on potentially non-free data scraped from Reddit image links. The more recent fine-tune used only Wikimedia and SDXL outputs, but because it was seeded with the earlier base model, I ultimately decided to apply a non-commercial CC license to the checkpoint. But here’s an important distinction: that model, like many of the use cases you mention, is non-generative. You can’t coerce it into reproducing any of the original training material, because it’s just a classification tool. I personally rate those models as much fairer uses of copyrighted material, though perhaps no better in terms of harm from a data dignity or bias propagation standpoint.
