Comment on In Cringe Video, OpenAI CTO Says She Doesn’t Know Where Sora’s Training Data Came From
HaywardT@lemmy.sdf.org 8 months ago
Interesting article. It seems to be about a bug, not a designed behavior. It also says it exposes random excerpts from books and other training data.
Linkerbaan@lemmy.world 8 months ago
It’s not designed to do that because they don’t want to reveal the training data. But factually all neural networks are a combination of their training data encoded into neurons.
When given the right prompt (or image generation request), they will replicate it exactly, because that’s how they were trained in the first place: replicating their source images with as few neurons as possible, and tweaking the neurons when the output is not correct.
HaywardT@lemmy.sdf.org 8 months ago
That is a little like saying every photograph is a copy of the thing it depicts. That is just factually incorrect. I have many three-layer networks that are not copies of the thing they were trained on. As a compression method they can be very lossy, and in fact that is often the point.
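A rough way to see the lossy-compression point is a toy three-layer network (input, hidden, output) trained to reproduce its own inputs. The sketch below is only an illustration under stated assumptions: the data is random noise standing in for "training data", the network is linear, and names such as `train_autoencoder` and `reconstruction_error` are made up for this example. It says nothing about how any production model behaves.

```python
import numpy as np

def train_autoencoder(data, hidden_units, steps=3000, lr=0.05, seed=0):
    """Train a linear input -> hidden -> output network to reproduce `data`."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_features, hidden_units))
    w_dec = rng.normal(scale=0.1, size=(hidden_units, n_features))
    for _ in range(steps):
        hidden = data @ w_enc
        output = hidden @ w_dec
        error = output - data
        # Gradients of the squared reconstruction error (up to a constant factor).
        grad_dec = hidden.T @ error / len(data)
        grad_enc = data.T @ (error @ w_dec.T) / len(data)
        # "Tweak" the weights whenever the output does not match the input.
        w_enc -= lr * grad_enc
        w_dec -= lr * grad_dec
    return w_enc, w_dec

def reconstruction_error(data, w_enc, w_dec):
    """Mean squared difference between the network output and the input."""
    return float(np.mean((data @ w_enc @ w_dec - data) ** 2))

rng = np.random.default_rng(42)
samples = rng.normal(size=(200, 8))  # hypothetical stand-in for training data

narrow = train_autoencoder(samples, hidden_units=2)  # heavy bottleneck
wide = train_autoencoder(samples, hidden_units=8)    # no bottleneck

narrow_error = reconstruction_error(samples, *narrow)
wide_error = reconstruction_error(samples, *wide)
print(f"2 hidden units: reconstruction error {narrow_error:.4f}")
print(f"8 hidden units: reconstruction error {wide_error:.4f}")
```

With only 2 hidden units the network cannot reproduce the 8-dimensional samples, no matter how long it trains; with 8 hidden units it has enough capacity to get much closer. How much of the training data is recoverable depends on the network's capacity relative to the data.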