Really nice way to formalize a collective intuition
4 stars
This paper formally equates lossless compression with prediction, and by extension with learning. While the Hutter Prize has long postulated this connection, the paper demonstrates it empirically: an LLM can act as a better compressor for multimodal data than today's domain-specific standards. The authors also use the gzip compression algorithm as a generative model, with rather poor results, but they build a mathematical framework for others to build on.
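To make the compressor-as-generative-model idea concrete, here is a minimal sketch of my own (using Python's zlib, not the paper's code): treat the compressed length of a string as a stand-in for its negative log-probability, and "generate" the next character greedily by picking whichever one compresses best with the context.

```python
# My own illustration of compression-as-prediction, not the paper's code:
# a compressor's code length approximates -log2 P(text) (in bytes here),
# so the symbol that adds the fewest compressed bytes is the compressor's
# implicit "most likely" continuation.
import zlib

def code_length(text: str) -> int:
    """Length in bytes of the zlib-compressed text (~ -log2 P(text) / 8)."""
    return len(zlib.compress(text.encode("utf-8"), 9))

def next_char(context: str, alphabet: str = "abcdefghijklmnopqrstuvwxyz ") -> str:
    """Greedy next-symbol choice: the character whose appended text compresses smallest."""
    return min(alphabet, key=lambda c: code_length(context + c))

# Repetitive text compresses far better than it is long, which is the
# whole basis of the paper's equivalence.
assert code_length("a" * 1000) < 50
```

The byte-level granularity of gzip's output is exactly why it makes a coarse generative model: many candidate symbols tie on compressed length, which matches the poor generative quality the paper reports.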
The paper also treats tokenization as a form of compression, an angle that has been missing from much of the scientific discourse on this subject. Overall a nice read; 4* only because it ends abruptly without fully exploring the space of compressors as generative models.
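The tokenization-as-compression point can be seen in a toy byte-pair-encoding step (my own sketch, not from the paper): each BPE merge is a greedy dictionary-compression pass that shortens the sequence at the cost of a larger alphabet.

```python
# Toy BPE merge step (my own illustration): tokenization compresses the
# token sequence by replacing the most frequent adjacent pair with a new symbol.
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent token pair."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge(tokens, pair, new_token):
    """Replace every non-overlapping occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = list("aaabdaaabac")          # 11 symbols
pair = most_frequent_pair(tokens)     # ('a', 'a') is the most frequent pair
tokens = merge(tokens, pair, "aa")    # 9 symbols: shorter sequence, bigger vocabulary
```

One merge already shrinks the sequence from 11 tokens to 9, which is the sense in which a tokenizer is itself a (lossless) compressor.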