Comment on A.I. groks 66%-76% faster with data augmentation strategies.

Hackworth@lemmy.world 4 months ago

We follow the classic experimental paradigm reported in Power et al. (2022) for analyzing "grokking", a poorly understood phenomenon in which validation accuracy dramatically improves long after the train loss saturates. Unlike the previous templates, this one is more amenable to open-ended empirical analysis (e.g. under what conditions grokking occurs) rather than just trying to improve performance metrics.
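For anyone who wants to poke at this, here is a rough sketch of how the "validation improves long after training saturates" gap could be measured. It is not the quoted paper's code. The modular addition task, the modulus p=97, the 50% train split, the 0.99 accuracy threshold, and all function names are assumptions chosen for illustration, and the sketch uses training accuracy reaching the threshold as a stand-in for "train loss saturates".

```python
import random


def modular_addition_dataset(p=97, train_fraction=0.5, seed=0):
    """Build every pair (a, b) labelled (a + b) mod p, then shuffle and split.

    p, train_fraction and seed are illustrative defaults (assumptions).
    """
    pairs = [((a, b), (a + b) % p) for a in range(p) for b in range(p)]
    random.Random(seed).shuffle(pairs)
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]


def first_step_reaching(steps, accuracies, threshold=0.99):
    """Return the first logged step whose accuracy is >= threshold, else None."""
    for step, acc in zip(steps, accuracies):
        if acc >= threshold:
            return step
    return None


def grokking_delay(steps, train_acc, val_acc, threshold=0.99):
    """Steps between train accuracy and validation accuracy reaching threshold.

    Returns None if either curve never reaches the threshold.
    """
    train_step = first_step_reaching(steps, train_acc, threshold)
    val_step = first_step_reaching(steps, val_acc, threshold)
    if train_step is None or val_step is None:
        return None
    return val_step - train_step
```

A large positive delay on logged curves is the signature being described; comparing that delay across runs with and without augmentation is one way to frame a "groks faster" claim.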
