Comment on Simply explained: how does GPT work?

qwertyasdef@programming.dev 11 months ago

Ask it a question about basketball. It looks through all documents it can find about basketball…

I get that this is a simplified explanation, but I want to add that this part can be misleading. The model doesn’t contain the original documents and doesn’t have internet access to look them up. (Internet access can be added as an extra feature, but even then it’s used more as a source to show humans than as something for the model to learn from on the fly.) The actual word associations are all learned during training, and during inference the model just uses the stored weights. One implication is that the model doesn’t know about anything that happened after its training data was collected.
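To make the training vs. inference split concrete, here is a rough toy sketch. It is not how GPT actually works (GPT is a neural network with learned parameters, not a word-pair counter), and the names `train`, `predict_next`, and the tiny corpus are all made up for illustration. The only point is that once training is done, the documents can be thrown away and prediction only ever touches the stored numbers.

```python
from collections import Counter, defaultdict


def train(corpus):
    """Toy 'training': count which word follows which in the training text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    # Return plain dicts: a fixed snapshot standing in for the stored weights.
    return {word: dict(nexts) for word, nexts in counts.items()}


def predict_next(weights, word):
    """Toy 'inference': consult only the stored weights, never the documents."""
    candidates = weights.get(word.lower())
    if not candidates:
        return None  # word never appeared in training, so nothing is known
    return max(candidates, key=candidates.get)


# Hypothetical training data, invented for this example.
training_corpus = [
    "the player shoots the ball",
    "the player passes the ball",
    "the ball goes in the hoop",
]
weights = train(training_corpus)
del training_corpus  # the original documents are gone; only weights remain

print(predict_next(weights, "the"))         # -> ball
print(predict_next(weights, "pickleball"))  # -> None
```

The second call returns nothing because "pickleball" never showed up in the training text, which is the same reason a real model can't know about events after its training data was collected.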
