
How to calculate the weight of information? (2019)

# How to calculate the weight of information?

#### Background

If you cannot stand mathematics, please read the lighter version of this blog, which gives you the story without the math. It skips the detailed calculations that this page walks through.

First, we go about 150 years back in time, to when James Maxwell wrote a letter to Peter Tait about a thought experiment that seemed to violate the second law of thermodynamics.

Figure 1. The thought experiment described by James Maxwell in 1867.

Maxwell imagined a demon, which he called a 'finite being', controlling a door between two isolated chambers filled with gas. The demon observes the speed of the gas molecules, and every time it sees a fast-moving molecule in the right chamber it opens the door and lets the molecule pass to the left side. Similarly, every time it sees a slow-moving molecule in the left chamber it opens the door and allows the molecule to move to the right side.

After a while, all the hot (fast) molecules would be in the left chamber and all the cold (slow) molecules in the right chamber. In this way the demon creates order, i.e., reduces entropy, seemingly without any outside input of energy (Figure 1).

#### Landauer's Limit

It took over 100 years to solve Maxwell's demon problem. Rolf Landauer, while working at IBM, solved it in 1961 by proposing a theoretical lower limit on the energy consumption of computation.

Landauer stated that it is possible to perform any computational operation without releasing heat, provided that the operation is logically reversible. The catch is that sooner or later the available space runs out and we must start to erase information; this erasure costs energy and releases heat, thus increasing entropy.

The fantastic consequence of Landauer's principle is that computers need not heat up at all, provided that their computations include no irreversible operations. Consequently, employing this principle in algorithm design can improve computational efficiency.

Curiously, applied to us humans this principle means that we might be able to learn efficiently, but when we irreversibly forget something, we need to expend energy to do it.

Maxwell's demon will also inevitably run out of space and must start to erase the information it has gathered about the molecules, at which point entropy begins to increase and heat is released, requiring an input of energy.

The equation of Landauer's limit states that the minimum energy $$E$$ required to erase one bit is: $E = k_{B}T\ln(2)$

(Eq. 1)
where $$k_{B}$$ is Boltzmann's constant, approximately $$1.38 \times 10^{-23}$$ J/K, and $$T$$ is the temperature of the environment in kelvin. Landauer further stated that, in general, the erasure energy $$E$$ satisfies $E \geq k_{B}T\ln(2)$
(Eq. 2)
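As a quick numerical check, Equation 1 is easy to evaluate; here is a minimal Python sketch (the function name is my own, not from the blog):

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, joules per kelvin

def landauer_limit(temperature_kelvin: float) -> float:
    """Minimum energy, in joules, required to erase one bit (Eq. 1)."""
    return K_B * temperature_kelvin * math.log(2)

# At roughly room temperature (300 K), erasing a single bit costs at least:
print(landauer_limit(300.0))  # ≈ 2.9e-21 J
```

Even at body temperature the cost per bit stays on the order of $$10^{-21}$$ joules, which is why ordinary computers, dissipating many orders of magnitude more per operation, are nowhere near this bound.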

In statistical mechanics, Boltzmann and Gibbs defined the thermodynamic entropy $$S$$ as $S = -k_{B}\sum_{i}p_{i}\ln(p_{i})$

(Eq. 3)
where $$p_{i}$$ is the probability of a microstate in equilibrium. A microstate is a specific configuration of particles, as opposed to a macrostate such as temperature.

Claude Shannon defined the information entropy $$H$$ as $H = -\sum_{i}p_{i}\log_{2}(p_{i})$

(Eq. 4)
where $$p_{i}$$ is the probability of a symbol in a message.

Equations (3) and (4) are identical except for the base of the logarithm: Boltzmann and Gibbs use the natural logarithm, while Shannon uses the logarithm base two. Converting from one to the other is just a division by $$\ln(2)$$; for example, $$\log_{2}(4) = 2$$, $$\ln(4) \approx 1.386$$, and $$\frac{1.386}{\ln(2)} = 2$$. Thus the two equations express the same quantity at different scales, except that the thermodynamic entropy $$S$$ additionally carries Boltzmann's constant, which ties it to the average kinetic energy of particles and gives it units of joules per kelvin.
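The base change can be demonstrated directly; this is a small Python sketch of Shannon's entropy (Eq. 4) with a selectable logarithm base (the function name and example probabilities are mine):

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy H = -sum_i p_i * log(p_i), in the given logarithm base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A four-symbol source with skewed probabilities:
probs = [0.5, 0.25, 0.125, 0.125]
h_bits = shannon_entropy(probs)               # entropy in bits (base 2)
h_nats = shannon_entropy(probs, base=math.e)  # same entropy in nats (base e)

# Changing the base only rescales the value by ln(2):
print(h_bits)                # 1.75 bits, up to float rounding
print(h_nats / math.log(2))  # the same value recovered from nats
```

Dividing the natural-log entropy by $$\ln(2)$$ recovers the base-two value exactly, illustrating that the two definitions differ only by scale.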

#### Weight of information in DNA

Thermodynamic entropy is measured in joules per kelvin, whereas information entropy is in bits. Now, let's store the entire human genome in a computer. We can encode each letter using two bits, since the DNA code consists of combinations of the four letters A, T, G, and C.
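A two-bit encoding can be sketched in a few lines of Python (the particular letter-to-bits mapping below is an arbitrary choice of mine; any one-to-one assignment works):

```python
# One possible two-bit assignment for the four DNA letters:
ENCODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack(sequence: str) -> int:
    """Pack a DNA string into a single integer, two bits per letter."""
    value = 0
    for letter in sequence:
        value = (value << 2) | ENCODE[letter]
    return value

print(bin(pack("GATTACA")))  # 0b10001111000100
```

Seven letters become fourteen bits, so a genome of about 3.2 billion letters fits in roughly 800 megabytes.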

Now the first problem is that a bit in a computer can carry a varying amount of information and is therefore not directly comparable to Shannon's entropy in bits as in Equation 4. The second problem is that genomes, with their four-letter code, suffer from the same issue: each letter can carry a varying amount of information, depending on its probability of occurrence in a message.

Furthermore, we could count the frequencies of the four letters A, T, C, and G making up a genome and use these frequencies to compute the entropy, but we probably cannot treat an entire genome as a single message. A genome is more likely a composite of many messages, for example instructions for making each of the different proteins, and so on.
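The frequency-counting approach just described can be sketched as follows (a toy illustration on short made-up sequences, not a real genome):

```python
from collections import Counter
import math

def sequence_entropy_bits(sequence: str) -> float:
    """Entropy per letter, in bits, from the observed letter frequencies."""
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A uniform sequence reaches the 2-bit maximum per letter:
print(sequence_entropy_bits("ATGC"))      # 2.0
# A skewed sequence carries less information per letter:
print(sequence_entropy_bits("AAAAATGC"))  # ≈ 1.55
```

The skewed example shows why real genomes, whose letter frequencies vary from region to region, would fall below the 2-bit maximum.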

Given all these complications, an exact calculation of the weight of information becomes cumbersome. To simplify, we assume that all four letters are equally probable, even though in reality their frequencies vary significantly from one genomic location to another.

With this assumption, we transfer the maximum entropy of DNA to bits in a computer by encoding each letter of a genome using two bits. The entropy of two computer bits, given that 0s and 1s are equally likely, is 2 bits, which is precisely the maximum entropy of each letter of a genome. Note that even if the overall probabilities of the symbols are equal, the probabilities can still vary locally.

Having thus equalized computer bits and Shannon's entropy bits, we can calculate the minimum amount of energy required to erase one bit. Since we are dealing with the human genome, we use the body temperature of 37 degrees Celsius, or 310 kelvin. Using Equation 1, the minimum energy $$E$$ required to erase two bits of information, equivalent to one letter of DNA, is: $1.38 \cdot 10^{-23} \times 310 \times \ln(2) \times 2 \approx 5.9 \cdot 10^{-21} \: J$
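In Python, the same arithmetic reads:

```python
import math

K_B = 1.380649e-23   # Boltzmann's constant, J/K
BODY_TEMP_K = 310.0  # 37 degrees Celsius in kelvin

# Landauer cost of erasing the two bits that encode one DNA letter:
energy_per_letter = K_B * BODY_TEMP_K * math.log(2) * 2
print(energy_per_letter)  # ≈ 5.9e-21 J
```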

Then, using Einstein's famous equation of mass-energy equivalence ($$E=mc^{2}$$), we can calculate the mass by dividing the energy by the speed of light squared: $m = \frac{5.9 \cdot 10^{-21}}{(299{,}792{,}458)^{2}} \approx 6.6 \cdot 10^{-38} \: kg$
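The conversion is a one-liner; note that $$E = mc^{2}$$ requires dividing the energy by the *square* of the speed of light:

```python
C = 299_792_458.0            # speed of light in vacuum, m/s
energy_per_letter = 5.9e-21  # J, Landauer cost of one DNA letter at 310 K

# E = m * c**2, so the equivalent mass is the energy divided by c squared:
mass_per_letter = energy_per_letter / C**2
print(mass_per_letter)  # ≈ 6.6e-38 kg
```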

If we further assume a haploid genome of about $$3.2 \cdot 10^{9}$$ letters and roughly $$3.7 \cdot 10^{13}$$ cells per body, each carrying a diploid copy, then one person carries about $$6.6 \cdot 10^{-38} \times 6.4 \cdot 10^{9} \times 3.7 \cdot 10^{13} \approx 1.6 \cdot 10^{-14} \: kg$$ of pure information. With a world population of about 7 billion, the total weight of pure information contained in all humankind is maximally $$1.6 \cdot 10^{-14} \times 7 \cdot 10^{9} \approx 1 \cdot 10^{-4} \: kg$$, roughly the weight of a grain of sand.

Yonggun Jun and colleagues experimentally verified Landauer's limit in 2014 by holding fluorescent particles in a feedback trap.

We haven't touched on the information content of the brain, because it is difficult to estimate accurately and depends on the individual, the environment, and timing as well.

#### Related Topics

We will be covering many related topics in the blogs. Among others, we wonder whether one of the current evolutionary paths is the use of information to gather and create ever larger amounts of it, at some point resulting in entirely new species based in some way on artificial intelligence. After all, we human beings generate several quintillion bytes of data every day, and yet our collective maximal weight of information is only about that of a grain of sand.

We cover these and many other topics in upcoming blogs. Keep an eye on our homepage and/or the list of blogs page.

A related tutorial for this blog is "Introduction to Information Theory and Its Applications to DNA and Protein Sequence Alignments," which introduces information theory from the beginning, involving as little math as possible.

#### References

Data Never Sleeps 5.0, domo.com.

Shannon C.E. (1948). "A mathematical theory of communication." Bell Syst. Tech. J. 27, 379–423, 623–656.

Szilard L. (1929). "Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen." Z. Phys. 53, 840–856. doi:10.1007/BF01341281.

Fisher R.A. (1935) "The logic of inductive inference." J. R. Stat. Soc. 98, 39–82. doi:10.2307/2342435.

Kullback S. (1959). "Information theory and statistics." New York, NY: Wiley.

Shannon C.E., Weaver W. (1962). "The mathematical theory of communication." Urbana, IL: The University of Illinois Press.

Jaynes E.T. (2003). "Probability theory: The logic of science." Cambridge, UK: Cambridge University Press.

Karnani M., Pääkkönen K., Annila A. (2009). "The physical character of information." Proc. R. Soc. A. 465 (2107): 2155–75. doi:10.1098/rspa.2009.0063.

Cargill Gilston Knott (1911). "Quote from undated letter from Maxwell to Tait." Life and Scientific Work of Peter Guthrie Tait. Cambridge University Press. pp. 213–215.

Rolf Landauer (1961), "Irreversibility and heat generation in the computing process." IBM Journal of Research and Development, 5 (3): 183–191, doi:10.1147/rd.53.0183

Mario Rabinowitz (2015). "General Derivation of Mass-Energy Relation without Electrodynamics or Einstein's Postulates." Journal of Modern Physics, 6, 1243–1248. doi:10.4236/jmp.2015.69129.

Charles H. Bennett (2003), "Notes on Landauer's principle, Reversible Computation and Maxwell's Demon." Studies in History and Philosophy of Modern Physics, 34 (3): 501–510, doi:10.1016/S1355-2198(03)00039-X

Yonggun Jun; Momčilo Gavrilov; John Bechhoefer (4 November 2014), "High-Precision Test of Landauer's Principle in a Feedback Trap." Physical Review Letters, 113 (19): 190601, doi:10.1103/PhysRevLett.113.190601