Compression and decompression of information in LLMs
LLMs are machines that compress and dilate the information contained in messages. From the GPT-5 family onward, or with Gemini-3 and Claude 4.5, LLMs became so good at information compression that their responses are often too dense for a human reader.
With the first generations of language models, we were often confronted with redundant messages. Many specialists criticized the stereotyped responses of AI systems in their fields, which seemed interesting on the surface but were severely lacking in substance.
Today, LLMs have become so capable that human readers often have to find ways to decompress the information contained in a response. Finding the right density/comprehensibility ratio is essential, but this ratio varies from one individual to another, from one domain to another, from one situation to another. LLMs are precision tools: when calibrated correctly, their results are markedly better, and the same applies to this ratio.
Implications
Important: the theoretical background for these reflections appears in thinkers such as Claude Shannon and Andrey Kolmogorov. The two mathematicians sought, each in his own way, to answer the question of the maximum information a given message could contain. LLMs force us to revisit their research from a new angle: they can modulate the density of information in a message automatically. What can that be used for? This is still a very theoretical model, but it could lead to new communication protocols in which the message is "optimized" for its reader.
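To make Shannon's notion of information density concrete, the sketch below estimates the empirical entropy of a message, in bits per character, from its own character frequencies. This is only a crude proxy for the density discussed above (Kolmogorov complexity, the other measure mentioned, is not computable at all), and the example messages are purely illustrative:

```python
from collections import Counter
from math import log2

def char_entropy(message: str) -> float:
    """Empirical Shannon entropy in bits per character,
    estimated from the message's own character frequencies."""
    counts = Counter(message)
    n = len(message)
    # H = -sum(p * log2(p)) over the observed character distribution
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A redundant message carries less information per character
# than a varied one of the same length:
redundant = "aaaaaaabbbbbbb"   # two symbols, heavy repetition
varied = "abcdefghijklmn"      # every symbol distinct
print(char_entropy(redundant))  # 1.0 bit/char
print(char_entropy(varied))     # log2(14) ≈ 3.81 bits/char
```

On this view, the "decompression" a dense LLM response demands from its reader corresponds to unpacking a message whose per-symbol information content is high; the hypothetical protocols sketched above would tune that figure to the reader rather than leave it fixed.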