Comment on Google Researchers’ Attack Prompts ChatGPT to Reveal Its Training Data
pntha@lemmy.world 11 months ago
How do we know the ChatGPT models haven’t crawled the publicly accessible breach forums where private data is known to leak? I imagine the crawlers would have some ‘follow webpage attachments, then crawl’ function. Surely they have crawled all sorts of leaked data online. Genuine question, though, because I haven’t done any previous research.
d3Xt3r@lemmy.nz 11 months ago
We don’t, but from what I’ve seen, those forums either require registration or payment to access the data, and/or some special means to download it (e.g., BitTorrent). A simple web crawler wouldn’t be able to access it.