If someone types “how are you?” into an SMS message box… chances are really high the other person will respond with “I’m good, how are you?” or something along those lines.
That’s how ChatGPT works. It’s essentially a database of likely responses to questions.
It’s not a fixed list of responses to every possible question; it’s a mathematical model that can handle arbitrary questions and deliver the most likely response. So for example if you ask “how you are?” you’ll get the same answer as “how are you?”
ChatGPT is also programmed to behave a certain way - for example if you actually ask how it is, it will tell you it’s not a person and doesn’t have feelings/etc.
Finally - it’s a little bit random, so if you ask the same question 20 times, you’ll get 20 slightly different responses.
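That randomness can be sketched in a few lines of Python. This is just a toy stand-in for what a real model does — the response table and its probabilities here are made up for illustration:

```python
import random

# Toy "likely responses" table -- a stand-in for the probabilities a real
# model learns from its training data (these numbers are invented).
responses = {
    "I'm good, how are you?": 0.6,
    "Good, you?": 0.3,
    "Not bad!": 0.1,
}

def reply():
    # Pick a response at random, weighted by likelihood -- this is why
    # asking the same question 20 times gives slightly different answers.
    options = list(responses)
    weights = list(responses.values())
    return random.choices(options, weights=weights)[0]

print(reply())  # usually "I'm good, how are you?", but not always
```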
polyfire@waveform.social 1 year ago
You know when you’re typing on your phone and you have that bar above the keyboard showing you what word it thinks you are writing? If you tap the word before you finish typing it, it can even show you the word it thinks you are going to write next. GPT works the same way, it just has waaaay more data that it can sample from.
It’s all just very advanced predictive text algorithms.
Ask it a question about basketball. It looks through all documents it can find about basketball and sees how often they reference hoops, Michael Jordan, sneakers, NBA, etc. And just outputs things that are highly referenced in a structure that makes grammatical sense.
For instance, if it has the word ‘basketball’, it knows it’s very unlikely for the word before it to be ‘radish’ and more likely to be a word like ‘the’ or ‘play’, so it just strings words together logically.
That’s the basics anyway.
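You can see the basic idea with a toy “which word comes next” counter in Python. The tiny corpus here is made up, and real models use far more context than one previous word, but the counting idea is the same:

```python
from collections import Counter

# Tiny invented corpus standing in for "all documents about basketball".
corpus = "we play basketball in the gym and watch the NBA play basketball".split()

# Count which word follows each word (a bigram table).
next_word = {}
for prev, cur in zip(corpus, corpus[1:]):
    next_word.setdefault(prev, Counter())[cur] += 1

# The most common word after "play" in this corpus:
print(next_word["play"].most_common(1)[0][0])  # "basketball"
```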
polyfire@waveform.social 1 year ago
Edit: I see now it’s an article and not just you asking a question lol. I’ll leave it up anyway.
qwertyasdef@programming.dev 1 year ago
I get that this is a simplified explanation but want to add that this part can be misleading. The model doesn’t contain the original documents and doesn’t have internet access to look up the documents (though that can be added as an extra feature, but even then it’s used more as a source to show humans than something for the model to learn from on the fly). The actual word associations are all learned during training, and during inference it just uses the stored weights. One implication of this is that the model doesn’t know about anything that happened after its training data was collected.
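A rough sketch of that training-vs-inference split, in toy Python form (the “weights” here are just bigram counts, nothing like how GPT really stores things, but the point is the same — once training is done, the original documents aren’t needed):

```python
from collections import Counter

# "Training": scan the documents once and keep only the counts
# (our stand-in for stored weights). The documents are invented.
documents = ["the cat sat", "the cat ran", "the dog ran"]
weights = Counter()
for doc in documents:
    words = doc.split()
    for prev, cur in zip(words, words[1:]):
        weights[(prev, cur)] += 1

del documents  # inference no longer needs the originals

# "Inference": answer using only the stored counts.
def most_likely_after(word):
    candidates = {cur: n for (prev, cur), n in weights.items() if prev == word}
    return max(candidates, key=candidates.get)

print(most_likely_after("the"))  # "cat" (seen twice vs "dog" once)
```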
polyfire@waveform.social 1 year ago
I wonder what an ELI5 version of ‘stored weights’ would be in this context.