Comment on Intentionally corrupting LLM training data?
heeplr@feddit.de 1 year ago
show the 20 pages of random words to your users, right?
Any dev worth their salt is going to check the user agent string for GPTBot.
That said, it’s a perfect recipe for getting companies to spoof browsers.
TootSweet@lemmy.world 1 year ago
Yeah, and even if OpenAI uses a user agent that identifies its bot as GPTBot, there’s no guarantee other scrapers will be so kind.
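For reference, the user agent check mentioned above could look roughly like this in Python. This is only a minimal sketch: the function names `is_gptbot` and `pick_response` and the "decoy"/"real" labels are made up for illustration, and it assumes the crawler puts the token "GPTBot" somewhere in its User-Agent header. As noted above, it does nothing against a scraper that spoofs a browser.

```python
from typing import Optional


def is_gptbot(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header self-identifies as GPTBot.

    Assumption: the crawler includes the token "GPTBot" in its header.
    A scraper that spoofs a browser user agent will not be detected.
    """
    return "gptbot" in (user_agent or "").lower()


def pick_response(user_agent: Optional[str]) -> str:
    """Hypothetical routing: decoy text for the bot, the real page for everyone else."""
    return "decoy" if is_gptbot(user_agent) else "real"
```

So `pick_response` hands the random words only to requests that announce themselves as GPTBot, and regular users keep getting the real page. A missing header is treated like a normal visitor.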