Comment on A Project to Poison LLM Crawlers

FaceDeer@fedia.io 15 hours ago

This doesn't work, but if it makes people feel better, I suppose they can waste their resources doing it.

Modern LLMs aren't trained on just whatever raw data can be scraped off the web any more. They're trained with synthetic data that's prepared by other LLMs and carefully crafted and curated. Folks are still thinking GPT-3 is state of the art here.
