Comment on OpenAI Wants to Eat Google Search's Lunch
sunbeam60@lemmy.one 11 months ago
But the content has already been absorbed. I wouldn’t be surprised if they have all of it sucked up (many would argue illegally) and stored as a corpus for them to iterate on. It’s not like they go out and crawl the entire web every time they train a new version of their model.
platypus_plumba@lemmy.world 11 months ago
Right, they already have scary amounts of data.
QuaternionsRock@lemmy.world 11 months ago
One of the craziest facts about GPT (to me) is that it was trained on 570GB of text data. That’s obviously a lot of text, but it’s bewildering to me that I could theoretically store their entire training dataset on my laptop.