In that case, how did they only choose 8000 posts over 6 years? Facebook probably gets more than 8000 new posts per minute.
Comment on Study of 8k Posts Suggests 40+% of Facebook Posts are AI-Generated
hildegarde@lemmy.blahaj.zone 1 day ago
I'm pretty sure they selected posts from a 6 year period, not that they spent six years on the analysis.
dan@upvote.au 1 day ago
prole@lemmy.blahaj.zone 22 hours ago
I was wondering how far I’d have to scroll before getting to someone who doesn’t understand statistics complaining about the sample size…
dan@upvote.au 18 hours ago
There have likely been trillions of posts on Facebook during that time frame. Is a sample size of 8000 really sufficient for a corpus that large?
prole@lemmy.blahaj.zone 8 hours ago
Have you ever heard of “margin of error”?
Learn statistics, it’s actually super informative.
hildegarde@lemmy.blahaj.zone 1 day ago
Every study uses sampling. They don’t have the resources to check everything. I have to imagine it took a lot of work to verify conclusively whether something was or was not generated. It’s a much larger sample size than a lot of studies.
dan@upvote.au 1 day ago
I have to imagine it took a lot of work to verify conclusively whether something was or was not generated
The study is by a company that creates software to detect AI content, so it’s literally their whole job.
It’s a much larger sample size than a lot of studies.
It’s a very small proportion of the total number of Facebook posts though.
tal@lemmy.today 1 day ago
It’s an extremely small proportion of the total number of Facebook posts though. Nowhere near enough for statistical significance.
The proportion of the total population is almost irrelevant when you use random sampling. Random sampling doesn’t rely on examining a large portion of the population, but rather on the fact that it becomes increasingly unlikely for the sample to deviate dramatically from the population as a whole. This is a function of the number of samples you take, decoupled from the population size.
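To put rough numbers on that, here is a minimal sketch of the standard margin-of-error formula for a proportion. It assumes the 8000 posts were a simple random sample (the study may not have sampled that way), uses the worst case p = 0.5, and uses z = 1.96 for a 95% confidence level. The function name `margin_of_error` and the population figures are made up for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96, population=None):
    """Approximate margin of error for a sampled proportion.

    n          -- number of samples
    p          -- assumed true proportion (0.5 is the worst case)
    z          -- z-score for the confidence level (1.96 ~ 95%)
    population -- optional total population size; if given, the
                  finite population correction is applied
    """
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: only matters when n is a
        # noticeable fraction of the population.
        se *= math.sqrt((population - n) / (population - 1))
    return z * se

n = 8000
# Hypothetical population sizes, from 100 thousand to 1 trillion posts.
for N in (100_000, 1_000_000_000, 1_000_000_000_000):
    print(f"population {N:>16,}: +/- {100 * margin_of_error(n, population=N):.3f} points")
print(f"infinite population:         +/- {100 * margin_of_error(n):.3f} points")
```

Under those assumptions the margin comes out at roughly 1.1 percentage points, and going from a billion posts to a trillion barely moves it. Note that this only covers sampling error; it says nothing about how accurate the AI detector itself is.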
billwashere@lemmy.world 1 day ago
I can’t even fathom how they would go about testing if it’s an AI or not. I can’t imagine that’s an exact science either.