Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates

<- View Parent
PixelProf@lemmy.ca ⁨2⁩ ⁨months⁩ ago

Not to fully argue against your point, but I do want to push back on the citations bit. Given the way an LLM is trained, it’s not really close to equivalent to me citing papers researched for a paper. That would be more akin to asking me to cite every piece of written or verbal media I’ve ever encountered as they all contributed in some small way to way that the words were formulated here.

Now, if specific data were injected into the prompt, or maybe if it was fine-tuned on a small subset of highly specific data, I would agree those should be cited as they are being accessed more verbatim. The whole “magic” of LLMs was that it needed to cross a threshold of data, combined with the attentional mechanism, and then the network was pretty suddenly able to maintain coherent sentences structure. It was only with loads of varied data from many different sources that this really emerged.

source
Sort:hotnewtop