Comment on An Analysis of DeepMind's 'Language Modeling Is Compression' Paper

<- View Parent
AbouBenAdhem@lemmy.world ⁨1⁩ ⁨year⁩ ago

It looks like they did it both ways (“raw rate” vs “adjusted rate”):

In the case of the adjusted compression rate, the model’s size is also added to the compressed size, i.e., it becomes (compressed size + number of model parameters) / raw size. This metric allows us to see the impact of model parameters on the compression performance. A very large model might be able to compress the data better compared to a smaller model, but when its size is taken into account, the smaller model might be doing better. This metric allows us to see that.

source
Sort:hotnewtop