Comment

Comment on Federal judge sides with Meta in lawsuit over training AI models on copyrighted books

mindbleach@sh.itjust.works ⁨4⁩ ⁨days⁩ ago

Right, a 12 GB model trained on 100,000,000 images isn’t big enough to contain an MD5 checksum of each.

The same people expect it to identify the authorship of sentence fragments, but never quote one whole paragraph from any book ever. Now: gigabytes of text could be a significant fraction of all books. But finding a single recognizable page is news. Storing text is not what these companies spent a bajillion dollars on.

source

Sort:hotnew top