DonPiano@feddit.org 23 hours ago
That’s stupid, though. If you can explain 11% of the variance of some noisy phenomenon like cognitive and behavioral flexibility, that’s noteworthy. They tested both linear and quadratic terms, and the quadratic one worked better in terms of prediction, and is also an expression of a meaningful theoretical model, rather than just throwing higher polynomials at it for the fun of it. Quadratic here also would coincide with some homogenizing mechanism at the two ends of the age distribution.
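For anyone who wants to poke at the linear-vs-quadratic comparison themselves, here is a minimal sketch on synthetic data. To be loud about it: this is not the study's data, the U-shaped effect size and noise level are arbitrary choices, and `solve_linear_system` and `r_squared` are hypothetical helper names made up for the illustration. It uses only the standard library.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def r_squared(x, y, degree):
    """In-sample R^2 of a least-squares polynomial fit of the given degree."""
    powers = range(degree + 1)
    # Normal equations (X^T X) coef = X^T y for the polynomial design matrix.
    xtx = [[sum(xi ** (i + j) for xi in x) for j in powers] for i in powers]
    xty = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in powers]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * xi ** i for i, c in enumerate(coef)) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Synthetic stand-in data (assumption): ages 20 to 80, a weak U-shaped
# effect of age, and a lot of noise on top.
rng = random.Random(0)
ages = [rng.uniform(20, 80) for _ in range(500)]
z = [(age - 50) / 30 for age in ages]  # centered and scaled age
score = [0.5 * zi ** 2 + rng.gauss(0, 0.45) for zi in z]

r2_linear = r_squared(z, score, 1)
r2_quadratic = r_squared(z, score, 2)
print(f"linear R^2:    {r2_linear:.3f}")
print(f"quadratic R^2: {r2_quadratic:.3f}")
```

One caveat: in-sample R^2 can never go down when you add a term, so the quadratic fit matching or beating the linear one here is guaranteed by construction. A claim that it predicts better would need something like held-out data or a penalized criterion, which this sketch does not do.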
TowardsTheFuture@lemmy.zip 7 hours ago
Yet it’s one single sample, and possibly not a great one. A few things could cause the shape seen: sample selection of healthy people ignores a lot more of the 65+ community than the younger, and stuff like those born around the 50’s having higher lead levels could cause more of a dip, or like… plenty of stuff. After some repetitions, sure, but even then… that’s 11%. Hell, I could probably put in an exponential with a negative exponent and be as accurate or better.
DonPiano@feddit.org 4 hours ago
Sure, you could do some wild overfitting. But why? What substantive theoretical model would such a data model correspond to?
A more straightforward conclusion to draw would be that age is far from the only predictor of flexibility etc., but on the list nevertheless, and if you wanna rule out alternative explanations (or support them), you might have to go and do more observations that allow such arguments to be constructed.
toynbee@lemmy.world 21 hours ago
Whether you’re right or wrong, starting your argument with “that’s stupid, though” is unlikely to convince many.
dream_weasel@sh.itjust.works 9 hours ago
That’s stupid though. People should change their minds when better information is presented regardless of tone!
DonPiano@feddit.org 14 hours ago
Maybe, yeah, but I kinda get annoyed at this kinda dismissiveness - it’s a type of vague anti-science or something like that. Like… Sure, overfitting is a potential issue, but the answer to that isn’t to never fit any curve when data is noisy, it is (among other things) to build solid theories and good tests thereof. A lot of interesting stuff, especially behavioral things, is noisy and you can’t expect to always have relationships that are simple enough to see.
You’re probably right. But also, I was annoyed, not trying to convince. Maybe not the best place to post from. :)
TimewornTraveler@lemmy.dbzer0.com 20 hours ago
well it convinced me, but I’m stupid and already made up my mind that I wanted to see a reply like that
onslaught545@lemmy.zip 22 hours ago
But I have eyes and the curve they picked as best fit is really poorly fitting. It’s such a poor fit that it’s almost in a dead zone of the random points.
DonPiano@feddit.org 14 hours ago
I dunno, the point cloud looks to me like some kinda symmetric upward curve. I’d’ve guessed maybe more like R^2=.2 or something in that range, though.
But also: This is noisy, it’s cool to see anything.
SaveTheTuaHawk@lemmy.ca 10 hours ago
It’s a line fitted to a shotgun blast. R^2 = 0.11, LOL.
sus@programming.dev 12 hours ago
wtf is up with that confidence interval(?) though
grrgyle@slrpnk.net 9 hours ago
Now this should be an xkcd
DonPiano@feddit.org 4 hours ago
To be honest, I doubt Munroe wants to say “if the effect is smaller than you, personally, can spot in the scatterplot, disbelieve any and all conclusions drawn from the dataset”. He seems to be a bit more evenhanded than that, even though I wouldn’t be surprised if a sizable portion of his fans weren’t.
It’s kinda weird, scatterplot inspection is an extremely useful tool in principled data analysis, but spotting stuff is neither sufficient nor necessary for something to be meaningful.
But also… an R^2 of .1 corresponds to a Cohen’s d of 0.67. If this were a comparison of groups, roughly three quarters of the control group would be below the average person in the experimental group. I suspect people (including me) are just bad at intuitions about this kinda thing and like to try to feel superior or something and let loose some half-baked ideas about statistics. Which is a shame, because some of those ideas can become pretty, once fully baked.
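The arithmetic behind those two numbers can be checked in a few lines. This is a sketch under the usual textbook assumptions, which are mine and not stated in the comment: R^2 is treated as a squared correlation between a two-group indicator and the outcome, the groups are equal in size, and both are normal with equal variance. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def r2_to_cohens_d(r2):
    """Convert R^2 to Cohen's d via d = 2r / sqrt(1 - r^2), equal group sizes assumed."""
    r = sqrt(r2)
    return 2 * r / sqrt(1 - r2)


def share_below_other_mean(d):
    """Share of the lower group falling below the higher group's mean (Cohen's U3)."""
    return NormalDist().cdf(d)


d = r2_to_cohens_d(0.1)
u3 = share_below_other_mean(d)
print(f"d = {d:.2f}, share below the other group's mean = {u3:.0%}")
# prints "d = 0.67, share below the other group's mean = 75%"
```

Plugging in 0.11 instead of 0.1 gives a slightly larger d, so the rounded figure in the comment is, if anything, a bit conservative for the plot being discussed.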