Comment on In Cringe Video, OpenAI CTO Says She Doesn’t Know Where Sora’s Training Data Came From

AliasAKA@lemmy.world 3 months ago

Sora is a generative video model, not exactly a large language model.

But to answer your question: if all LLMs did was redirect you to where the content was hosted, they would be search engines. Instead, they reproduce what someone else was hosting, which may include copyrighted material. That makes them fundamentally different from a simple search engine: they don’t direct you to the source, they produce a facsimile of the source material without acknowledging it or pointing you to it. Sora is similar. It produces video content, but it doesn’t point you to the existing videos its output is drawn from. We can argue about how close something needs to be to an existing artwork to count as a reproduction, but I think AI models should be required to cite their sources.
