INFORMATION COMPRESSION
Predictions, correlations, and the compression of information in neural networks: Consistencies in a dataset allow information to be parsed for meaning.
Neuroscientists have developed a broad theoretical framework, called predictive processing, that offers a way to describe how our brains process information. At its core, predictive processing suggests that the brain is not just passively gathering information. Instead, it is actively making predictions about the world and updating those predictions based on new information it receives.
​
It turns out that there is a deep connection between predictive processing and thermodynamic efficiency within the brain. To think about this problem, let's consider how the brain handles entropy, a measure of disorder or uncertainty. The brain strives to reduce entropy, because less uncertainty means fewer resources are required to process and react to sensory input. A stable, predictable environment allows the brain to operate with greater efficiency. By contrast, an unstable, unpredictable environment drains energetic resources.
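To make "uncertainty" concrete: Shannon entropy measures, in bits, how spread out a probability distribution is over possible states. Here is a minimal sketch of that calculation - the two example environments are invented purely for illustration:

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits: H(p) = -sum(p_i * log2(p_i))."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Hypothetical sensory environments, each a distribution over four possible states.
predictable   = [0.97, 0.01, 0.01, 0.01]  # one state dominates: easy to predict
unpredictable = [0.25, 0.25, 0.25, 0.25]  # all states equally likely: maximal uncertainty

print(shannon_entropy(predictable))    # ~0.24 bits
print(shannon_entropy(unpredictable))  # 2.0 bits
```

The more predictable the environment, the fewer bits of uncertainty the system has to resolve.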
​
Research on thermodynamic computation suggests that the brain's predictive mechanisms are tuned to reduce entropy wherever possible. When the brain makes an accurate prediction, it decreases uncertainty, allowing the system to maintain a state of low energy expenditure. But when prediction errors occur, entropy rises, and the brain must work harder to restore balance by updating its models and reducing future errors. Essentially, the brain strives to align its internal model with the external world to minimize energy usage. Instead of encoding everything, it effectively assumes the world remains stable, and encodes only new or surprising events, folding these into the existing cognitive model.
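One way to picture this "encode only the surprises" strategy is a simple prediction-error update rule. This is a minimal sketch of the idea, not a model of real neural circuitry, and the numbers are invented for illustration:

```python
def update_prediction(prediction, observation, learning_rate=0.1):
    """Move the internal model toward the observation in proportion to the error."""
    error = observation - prediction           # the "surprise"
    return prediction + learning_rate * error  # only the error drives an update

prediction = 20.0  # e.g., expected room temperature in degrees C
for observation in [20.1, 19.9, 20.0, 25.0]:  # the last reading is surprising
    prediction = update_prediction(prediction, observation)
# Unsurprising readings barely change the model; the outlier forces a larger update.
```

Accurate predictions leave the model nearly untouched; only the surprising event demands work.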
This process is strikingly similar to what ChatGPT does when it builds sentences from user prompts. There is a distribution of possibilities for the next word. The large language model, having distilled the most common patterns from its extensive training data, produces a statistical distribution over possible next words. It then selects a likely candidate - typically by sampling from that distribution, rather than always taking the single most probable word - and repeats the process over and over again. This is a statistical computation, predicting the best fit based on previous data - and it works remarkably well, especially if the system is allowed to keep learning with the benefit of user feedback. [Sometimes it's just a little off, though!]
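In code, one step of that generate-and-repeat loop looks roughly like the sketch below. The toy vocabulary and probabilities are invented for illustration; in a real LLM, the distribution comes from a neural network, not a lookup table:

```python
import random

# Toy next-word distribution for the prompt "The cat sat on the ..."
next_word_probs = {"mat": 0.55, "floor": 0.20, "couch": 0.15, "moon": 0.10}

def sample_next_word(probs):
    """Sample one word in proportion to its probability."""
    words = list(probs)
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

sentence = ["The", "cat", "sat", "on", "the"]
sentence.append(sample_next_word(next_word_probs))
print(" ".join(sentence))  # usually "... the mat", occasionally something stranger
```

Crucially for the argument that follows, the model must compute and hold the entire distribution at every step, even though all but one candidate is thrown away.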
​
So are we doing something similar to ChatGPT? Or something more interesting? Recent work from the Western Institute for Advanced Study suggests the brain is not merely doing a statistical computation, like ChatGPT - it is actually doing a much more visceral form of computation.
​
A physical process of thermodynamic computation
​​
It may be useful here to talk about entropy. Entropy describes a distribution of possible system states - and while we often think of this as an abstract, purely computational quantity, entropy is also a conserved thermodynamic quantity. Free energy must therefore be spent to create that distribution of system states. In this sense, the computational power of a system is deeply tied to its energy efficiency.
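One standard way to put numbers on the entropy-free-energy link is Landauer's principle, which sets a floor on the free energy cost of creating or erasing a bit of uncertainty at temperature T. Connecting it to the argument here is offered as an illustration; the bound itself is well-established physics:

```latex
% Landauer's principle: erasing one bit of information at temperature T
% dissipates at least k_B T ln 2 of free energy.
E_{\min} \;=\; k_B \, T \ln 2
\;\approx\; \left(1.38 \times 10^{-23}\,\tfrac{\mathrm{J}}{\mathrm{K}}\right)
\left(310\,\mathrm{K}\right)\left(0.693\right)
\;\approx\; 3.0 \times 10^{-21}\,\mathrm{J\ per\ bit}
\quad \text{at body temperature } (T \approx 310\,\mathrm{K}).
```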
​
With ChatGPT, all of the other possibilities still have to be encoded. But in the human brain, we can discard those other possibilities. In this electrically noisy system, free energy converts into entropy. Then the system interacts with its surrounding environment and clicks into a mutually compatible state - a process of amplifying correlations and suppressing anomalies until only the most likely system state remains. This is information compression! During this computation, entropy is compressed - and because entropy is a conserved thermodynamic quantity, free energy is released back into the system. Once the computation is complete, that free energy is used to do work in the system - specifically, encoding the meaning that was extracted during the computation.
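As a toy calculation, using the Landauer relation above: collapsing a spread of candidate states into one compresses entropy by a countable number of bits, and the bound converts those bits into joules. Every number here is illustrative, and reading this as what the brain actually does is the hypothesis being advanced, not established fact:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K
T = 310.0           # body temperature, K

def entropy_bits(p):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Before: four candidate interpretations of a noisy signal, equally likely.
before = [0.25, 0.25, 0.25, 0.25]
# After: correlations amplified, anomalies suppressed, one state remains.
after = [1.0]

delta_bits = entropy_bits(before) - entropy_bits(after)  # 2.0 bits compressed away
free_energy = delta_bits * K_B * T * math.log(2)         # Landauer bound, joules
print(f"{delta_bits} bits -> {free_energy:.2e} J available at the Landauer limit")
```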
So, just like ChatGPT, the human brain is extracting signals from the noise, parsing new data for consistencies with its own "training data" - prior experience. But it seems that our computational process is a visceral, physical one, involving a highly efficient exchange of thermodynamic quantities - while ChatGPT loses huge quantities of energy encoding all of the possibilities through brute-force computation.
​
Either way, we are carrying out a computational process - compressing many possibilities into one outcome, extracting a signal from the noise, making a prediction - but our computational process appears to be more energy-efficient, and (intriguingly) paired with qualitative perceptual content.