Entropy in thermodynamics and information theory

From Wikipedia, the free encyclopedia

There are close parallels between the mathematical expressions for the thermodynamic entropy, usually denoted by S, of a physical system in the statistical thermodynamics established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s, and the information-theoretic entropy, usually expressed as H, developed by Claude Shannon in the 1940s following earlier work by Ralph Hartley. Shannon, although not initially aware of this similarity, commented on it when he presented information theory in A Mathematical Theory of Communication.

This article explores what links there are between the two concepts, and how far they can be regarded as connected.

Equivalence of form of the defining expressions

Boltzmann's grave in the Zentralfriedhof, Vienna, with bust and entropy formula.

The defining expression for entropy in the theory of statistical mechanics established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s is of the form:

    S = −kB Σi pi ln pi

where pi is the probability of the microstate i taken from an equilibrium ensemble.

The defining expression for entropy in the theory of information established by Claude E. Shannon in 1948 is of the form:

    H = −Σi pi log_b pi

where pi is the probability of the message mi taken from the message space M, and b is the base of the logarithm used. Common values of b are 2, Euler's number e, and 10, and the unit of entropy is the bit for b = 2, the nat for b = e, and the dit (or digit) for b = 10.[1]

Mathematically H may also be seen as an average information, taken over the message space, because when a certain message occurs with probability pi, the information −log(pi) will be obtained.
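As a minimal sketch of this averaging view, the entropy can be computed as the probability-weighted mean of the information −log_b(pi) carried by each message. The function names `surprisal` and `shannon_entropy` and the four-message distribution are illustrative assumptions, not taken from the source.

```python
import math

def surprisal(p, base=2.0):
    """Information -log_b(p) obtained when an event of probability p occurs."""
    return -math.log(p) / math.log(base)

def shannon_entropy(probs, base=2.0):
    """Average surprisal over a distribution; zero-probability terms add nothing."""
    return sum(p * surprisal(p, base) for p in probs if p > 0)

# Hypothetical four-message source
probs = [0.5, 0.25, 0.125, 0.125]
h_bits = shannon_entropy(probs, base=2)        # about 1.75 bits
h_nats = shannon_entropy(probs, base=math.e)   # about 1.213 nats
```

The same distribution yields different numbers in different bases, but the two results differ only by the constant factor ln(2).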

If all the microstates are equiprobable (a microcanonical ensemble), the statistical thermodynamic entropy reduces to the form, as given by Boltzmann,

    S = kB ln W

where W is the number of microstates.

If all the messages are equiprobable, the information entropy reduces to the Hartley entropy

    H = log_b |M|

where |M| is the cardinality of the message space M.

The logarithm in the thermodynamic definition is the natural logarithm. It can be shown that the Gibbs entropy formula, with the natural logarithm, reproduces all of the properties of the macroscopic classical thermodynamics of Rudolf Clausius. (See article: Entropy (statistical views)).

The logarithm can also be taken to the natural base in the case of information entropy. This is equivalent to choosing to measure information in nats instead of the usual bits. In practice, information entropy is almost always calculated using base 2 logarithms, but this distinction amounts to nothing other than a change in units. One nat is about 1.44 bits.
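The equiprobable reduction and the change of units can both be checked numerically. The sketch below assumes a hypothetical space of W = 8 equally likely outcomes; the variable names are illustrative.

```python
import math

W = 8                                   # hypothetical number of equiprobable outcomes
uniform = [1.0 / W] * W

# General formula -sum p ln p, evaluated in nats
h_nats = -sum(p * math.log(p) for p in uniform)

# For equal probabilities it collapses to ln(W)
assert math.isclose(h_nats, math.log(W))

# Changing the base only rescales the result: 1 nat = 1/ln(2) bits
bits_per_nat = 1.0 / math.log(2)        # about 1.4427
h_bits = h_nats * bits_per_nat          # about 3 bits, i.e. log2(8)
```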

For a simple compressible system that can only perform volume work, the first law of thermodynamics becomes

    dU = T dS − P dV

But one can equally well write this equation in terms of what physicists and chemists sometimes call the 'reduced' or dimensionless entropy, σ = S/kB, so that

    dU = kB T dσ − P dV

Just as S is conjugate to T, so σ is conjugate to kBT (the energy that is characteristic of T on a molecular scale).

Theoretical relationship

Despite the foregoing, there is a difference between the two quantities. The information entropy H can be calculated for any probability distribution (if the "message" is taken to be that the event i which had probability pi occurred, out of the space of the events possible), while the thermodynamic entropy S refers to thermodynamic probabilities pi specifically. The difference is more theoretical than actual, however, because any probability distribution can be approximated arbitrarily closely by some thermodynamic system.

Moreover, a direct connection can be made between the two. If the probabilities in question are the thermodynamic probabilities pi: the (reduced) Gibbs entropy σ can then be seen as simply the amount of Shannon information needed to define the detailed microscopic state of the system, given its macroscopic description. Or, in the words of G. N. Lewis writing about chemical entropy in 1930, "Gain in entropy always means loss of information, and nothing more". To be more concrete, in the discrete case using base two logarithms, the reduced Gibbs entropy is equal to the minimum number of yes–no questions needed to be answered in order to fully specify the microstate, given that we know the macrostate.
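The yes–no question picture can be illustrated for the simplest case of equiprobable microstates. In the sketch below, the helper `questions_needed` and the choice of W = 16 microstates are assumptions made for illustration; the microstate is located by repeatedly halving the set of candidates.

```python
import math

def questions_needed(num_states, target):
    """Count yes/no questions used by repeated halving to pin down `target`."""
    low, high = 0, num_states          # candidate microstates are low .. high-1
    asked = 0
    while high - low > 1:
        mid = (low + high) // 2
        asked += 1                     # question: "is the microstate index >= mid?"
        if target >= mid:
            low = mid
        else:
            high = mid
    return asked

W = 16                                  # hypothetical count of equiprobable microstates
counts = [questions_needed(W, t) for t in range(W)]
entropy_bits = math.log2(W)
```

Every microstate is identified after the same number of questions, and that number equals the entropy in bits. For unequal probabilities the entropy is a lower bound on the average number of questions rather than an exact count.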

Furthermore, the prescription to find the equilibrium distributions of statistical mechanics—such as the Boltzmann distribution—by maximising the Gibbs entropy subject to appropriate constraints (the Gibbs algorithm) can be seen as something not unique to thermodynamics, but as a principle of general relevance in statistical inference, if it is desired to find a maximally uninformative probability distribution, subject to certain constraints on its averages. (These perspectives are explored further in the article Maximum entropy thermodynamics.)
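A small numerical check can make the maximum-entropy property tangible. The sketch below assumes three equally spaced energy levels and an inverse temperature β = 1 in arbitrary units (both hypothetical), builds the Boltzmann distribution, and then perturbs it in a direction that preserves both the normalisation and the mean energy; the reduced entropy falls in either direction.

```python
import math

def gibbs_entropy(probs):
    """Reduced (dimensionless) entropy -sum p ln p."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def boltzmann(energies, beta):
    """Boltzmann distribution p_i proportional to exp(-beta * E_i)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)                       # partition function
    return [w / z for w in weights]

energies = [0.0, 1.0, 2.0]   # hypothetical levels, arbitrary energy units
beta = 1.0                   # hypothetical inverse temperature 1/(kB T)
p_eq = boltzmann(energies, beta)

# The shift (1, -2, 1) leaves both the normalisation and the mean energy
# unchanged for these three equally spaced levels.
direction = [1.0, -2.0, 1.0]
for eps in (-0.02, 0.02):
    p_trial = [p + eps * d for p, d in zip(p_eq, direction)]
    assert gibbs_entropy(p_trial) < gibbs_entropy(p_eq)
```

This only probes one constraint-preserving direction for one small system; it illustrates the principle and is not a proof of it.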

The Shannon entropy in information theory is sometimes expressed in units of bits per symbol (Chapter 1, section 7). The physical entropy may likewise be stated on a "per quantity" basis (h), called "intensive" entropy, instead of the usual total entropy, called "extensive" entropy. The "shannons" of a message (H) are its total "extensive" information entropy, equal to h times the number of symbols in the message.

A direct and physically real relationship between h and S can be found by assigning a symbol to each microstate that occurs per mole, kilogram, volume, or particle of a homogeneous substance, then calculating the h of these symbols. By theory or by observation, the symbols (microstates) will occur with different probabilities, and this will determine h. If there are N moles, kilograms, volumes, or particles of the unit substance, the relationship between h (in bits per unit substance) and physical extensive entropy in nats is:

    S = kB ln(2) N h

where ln(2) is the conversion factor from base 2 of Shannon entropy to the natural base e of physical entropy. N h is the amount of information in bits needed to describe the state of a physical system with entropy S. Landauer's principle demonstrates the reality of this by stating that the minimum energy E required (and therefore heat Q generated) by an ideally efficient memory change or logic operation that irreversibly erases or merges N h bits of information is S times the temperature, which is

    E = Q = T kB ln(2) N h

where h is in informational bits and E and Q are in physical joules. This has been experimentally confirmed.[2]
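To give a sense of scale, the Landauer limit of kB T ln(2) per erased bit can be evaluated directly. The sketch below assumes a temperature of 300 K and uses illustrative function names; only the Boltzmann constant is a fixed physical value.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant in J/K (exact in the 2019 SI)

def landauer_energy(bits, temperature_kelvin):
    """Minimum heat kB * T * ln(2) per bit for erasing `bits` bits at temperature T."""
    return bits * K_B * temperature_kelvin * math.log(2)

def entropy_from_bits(bits):
    """Thermodynamic entropy S = kB ln(2) * (N h) in J/K for N h bits."""
    return bits * K_B * math.log(2)

one_bit = landauer_energy(1, 300.0)         # about 2.87e-21 J at a hypothetical 300 K
one_gigabyte = landauer_energy(8e9, 300.0)  # about 2.3e-11 J
```

Even for a gigabyte of erased data the bound is many orders of magnitude below the energy that practical memory hardware dissipates, which is why it is a limit of principle rather than an everyday engineering constraint.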

Temperature is a measure of the average kinetic energy per particle in an ideal gas (kelvins = 2/3 × joules/kB), so the J/K units of kB are fundamentally unitless (joules/joules). kB is the conversion factor from energy in 3/2 × kelvins to joules for an ideal gas. If kinetic energy measurements per particle of an ideal gas were expressed in joules instead of kelvins, kB in the above equations would be replaced by 3/2. This shows that S is a true statistical measure of microstates that does not have a fundamental physical unit other than the units of information, in this case "nats", which is just a statement of which logarithm base was chosen by convention.

Information is physical

Szilard's engine

N-atom engine schematic

A physical thought experiment demonstrating how just the possession of information might in principle have thermodynamic consequences was established in 1929 by Leó Szilárd, in a refinement of the famous Maxwell's demon scenario.

Consider Maxwell's set-up, but with only a single gas particle in a box. If the supernatural demon knows which half of the box the particle is in (equivalent to a single bit of information), it can close a shutter between the two halves of the box, close a piston unopposed into the empty half of the box, and then extract W = kB T ln 2 joules of useful work if the shutter is opened again. The particle can then be left to isothermally expand back to its original equilibrium occupied volume. In just the right circumstances, therefore, the possession of a single bit of Shannon information (a single bit of negentropy in Brillouin's term) really does correspond to a reduction in the entropy of the physical system. The global entropy is not decreased, but information to energy conversion is possible.
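The amount of work available from one cycle can be sketched as follows, under the assumption that the single particle behaves as an ideal gas in contact with a heat bath at temperature T, so that its pressure on the piston at volume V' is kB T / V'. The symbol V' for the instantaneous volume is introduced here for illustration.

```latex
% Single-particle ideal gas held at temperature T:  P V' = k_B T
\begin{align}
W &= \int_{V/2}^{V} P \,\mathrm{d}V'
   = \int_{V/2}^{V} \frac{k_B T}{V'} \,\mathrm{d}V' \\
  &= k_B T \,\ln\!\frac{V}{V/2}
   = k_B T \ln 2 .
\end{align}
```

The isothermal expansion from half the box to the full box therefore yields kB T ln 2 of work, the same quantity that appears per bit in Landauer's principle.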

Using a phase-contrast microscope equipped with a high-speed camera connected to a computer as the demon, the principle has actually been demonstrated.[3] In this experiment, information to energy conversion is performed on a Brownian particle by means of feedback control; that is, synchronizing the work given to the particle with the information obtained on its position. Computing energy balances for different feedback protocols has confirmed that the Jarzynski equality requires a generalization that accounts for the amount of information involved in the feedback.

Landauer's principle

Main article: Landauer's principle

In fact one can generalise: any information that has a physical representation must somehow be embedded in the statistical mechanical degrees of freedom of a physical system.

Thus, Rolf Landauer argued in 1961, if one were to imagine starting with those degrees of freedom in a thermalised state, there would be a real reduction in thermodynamic entropy if they were then re-set to a known state. This can only be achieved under information-preserving microscopically deterministic dynamics if the uncertainty is somehow dumped somewhere else – i.e. if the entropy of the environment (or the non information-bearing degrees of freedom) is increased by at least an equivalent amount, as required by the Second Law, by gaining an appropriate quantity of heat: specifically kT ln 2 of heat for every 1 bit of randomness erased.

On the other hand, Landauer argued, there is no thermodynamic objection to a logically reversible operation potentially being achieved in a physically reversible way in the system. It is only logically irreversible operations – for example, the erasing of a bit to a known state, or the merging of two computation paths – which must be accompanied by a corresponding entropy increase. When information is physical, all processing of its representations, i.e. generation, encoding, transmission, decoding and interpretation, are natural processes where entropy increases by consumption of free energy.[4]

Applied to the Maxwell's demon/Szilard engine scenario, this suggests that it might be possible to "read" the state of the particle into a computing apparatus with no entropy cost; but only if the apparatus has already been SET into a known state, rather than being in a thermalised state of uncertainty. To SET (or RESET) the apparatus into this state will cost all the entropy that can be saved by knowing the state of Szilard's particle.

Negentropy

Main article: Negentropy

Shannon entropy has been related by physicist Léon Brillouin to a concept sometimes called negentropy. In 1953, Brillouin derived a general equation[5] stating that changing the value of an information bit requires at least kT ln(2) of energy. This is the same energy as the work Leo Szilard's engine produces in the idealized case. In his book,[6] he further explored this problem, concluding that any cause of a bit value change (measurement, decision about a yes/no question, erasure, display, etc.) will require the same amount, kT ln(2), of energy. Consequently, acquiring information about a system’s microstates is associated with an entropy production, while erasure yields entropy production only when the bit value is changing. Setting up a bit of information in a sub-system originally in thermal equilibrium results in a local entropy reduction. However, there is no violation of the second law of thermodynamics, according to Brillouin, since a reduction in any local system’s thermodynamic entropy results in an increase in thermodynamic entropy elsewhere. In this way, Brillouin clarified the meaning of negentropy, which had been considered controversial because its earlier understanding could yield a Carnot efficiency higher than one.

In 2009, Mahulikar & Herwig redefined thermodynamic negentropy as the specific entropy deficit of the dynamically ordered sub-system relative to its surroundings.[7] This definition enabled the formulation of the Negentropy Principle, which is mathematically shown to follow from the 2nd Law of Thermodynamics, during order existence.

Black holes

Stephen Hawking often spoke of the thermodynamic entropy of black holes in terms of their information content.[8] Do black holes destroy information? It appears that there are deep relations between the entropy of a black hole and information loss.[9] See Black hole thermodynamics and Black hole information paradox.

Quantum theory

Hirschman showed,[10] cf. Hirschman uncertainty, that Heisenberg's uncertainty principle can be expressed as a particular lower bound on the sum of the classical distribution entropies of the quantum observable probability distributions of a quantum mechanical state, the square of the wave-function, in coordinate, and also momentum space, when expressed in Planck units. The resulting inequalities provide a tighter bound on the uncertainty relations of Heisenberg.
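For Gaussian wave packets the entropic bound can be checked by hand. The sketch below assumes units with ħ = 1, in which the lower bound on the sum of the position and momentum entropies takes the value ln(πe), and it uses the standard differential entropy of a normal density; the function name and the sample widths are illustrative.

```python
import math

def gaussian_entropy(sigma):
    """Differential entropy (nats) of a normal density with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

bound = math.log(math.pi * math.e)      # ln(pi e), about 2.1447, with hbar = 1

for sigma_x in (0.1, 1.0, 7.5):
    sigma_p = 1.0 / (2.0 * sigma_x)     # minimum-uncertainty Gaussian wave packet
    total = gaussian_entropy(sigma_x) + gaussian_entropy(sigma_p)
    assert math.isclose(total, bound)   # the bound is saturated

# Gaussian densities with sigma_x * sigma_p > 1/2 sit above the bound
excess = gaussian_entropy(1.0) + gaussian_entropy(1.0)
```

Minimum-uncertainty Gaussians saturate the bound regardless of their width, while broader pairs of densities exceed it.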

One could speak of the "joint entropy" of the position and momentum distributions in this quantity by considering them independent, but since they are not jointly observable, they cannot be considered as a joint distribution. Note that this entropy is not the accepted entropy of a quantum system, the Von Neumann entropy, −Tr ρ lnρ = −⟨lnρ⟩. In phase-space, the Von Neumann entropy can nevertheless be represented equivalently to Hilbert space, even though positions and momenta are quantum conjugate variables; and thus leads to a properly bounded entropy distinctly different (more detailed) than Hirschman's; this one accounts for the full information content of a mixture of quantum states.[11]
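The von Neumann entropy −Tr ρ ln ρ reduces to the Shannon form applied to the eigenvalues of ρ. As a minimal sketch, restricted for simplicity to real symmetric 2×2 density matrices (an assumption of this example, as are the function names), it can be evaluated as follows.

```python
import math

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def von_neumann_entropy(a, b, d):
    """-Tr(rho ln rho) in nats for rho = [[a, b], [b, d]], via its eigenvalues."""
    total = 0.0
    for lam in eigenvalues_2x2_symmetric(a, b, d):
        if lam > 1e-15:                 # treat 0 ln 0 as 0
            total -= lam * math.log(lam)
    return total

pure = von_neumann_entropy(0.5, 0.5, 0.5)    # equal superposition, a pure state
mixed = von_neumann_entropy(0.5, 0.0, 0.5)   # maximally mixed qubit
```

The pure state has zero entropy even though its diagonal entries look like a fair coin, while the maximally mixed state has entropy ln 2; the difference lies entirely in the off-diagonal terms.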

(Dissatisfaction with the Von Neumann entropy from quantum information points of view has been expressed by Stotland, Pomeransky, Bachmat and Cohen, who have introduced a yet different definition of entropy that reflects the inherent uncertainty of quantum mechanical states. This definition allows distinction between the minimum uncertainty entropy of pure states, and the excess statistical entropy of mixtures.[12])

The fluctuation theorem

The fluctuation theorem provides a mathematical justification of the second law of thermodynamics under these principles, and precisely defines the limitations of the applicability of that law for systems away from thermodynamic equilibrium.

Criticism

There are many criticisms of the link between thermodynamic entropy and information entropy, and some are not without merit.

The most common criticism is that information entropy cannot be related to thermodynamic entropy because there is no concept of temperature, energy, or the second law, in the discipline of information entropy.[13][14][15][16][17] This can best be discussed by considering the fundamental equation of thermodynamics:

    dU = Σi Fi dxi

where the Fi are "generalized forces" and the dxi are "generalized displacements". This is analogous to the mechanical equation dE = F dx where dE is the change in the kinetic energy of an object having been displaced by distance dx under the influence of force F. For example, for a simple gas, we have:

    dU = T dS − P dV + µ dN

where the temperature (T), pressure (P), and chemical potential (µ) are generalized forces which, when imbalanced, result in a generalized displacement in entropy (S), volume (−V) and quantity (N) respectively, and the products of the forces and displacements yield the change in the internal energy (dU) of the gas.

In the mechanical example, to declare that dx is not a geometric displacement because it ignores the dynamic relationship between displacement, force, and energy is not correct. Displacement, as a concept in geometry, does not require the concepts of energy and force for its definition, and so one might expect that entropy may not require the concepts of energy and temperature for its definition. The situation is not that simple, however. In classical thermodynamics, which is the study of thermodynamics from a purely empirical, or measurement, point of view, thermodynamic entropy can only be measured by considering energy and temperature. Clausius' statement dS = δQ/T, or, equivalently, when all other effective displacements are zero, dS = dU/T, is the only way to actually measure thermodynamic entropy. It is only with the introduction of statistical mechanics, the viewpoint that a thermodynamic system consists of a collection of particles and which explains classical thermodynamics in terms of probability distributions, that the entropy can be considered separately from temperature and energy. This is expressed in Boltzmann's famous entropy formula S = kB ln(W). Here kB is Boltzmann's constant, and W is the number of equally probable microstates which yield a particular thermodynamic state, or macrostate.

Boltzmann's equation is presumed to provide a link between thermodynamic entropy S and information entropy H = −Σi pi ln pi = ln(W) where pi = 1/W are the equal probabilities of a given microstate. This interpretation has also been criticized. While some say that the equation is merely a unit conversion equation between thermodynamic and information entropy, this is not completely correct.[18] A unit conversion equation will, e.g., change inches to centimeters, and yield two measurements in different units of the same physical quantity (length). Since thermodynamic and information entropy are dimensionally unequal (energy/unit temperature vs. units of information), Boltzmann's equation is more akin to x = c t where x is the distance travelled by a light beam in time t, c being the speed of light. While we cannot say that length x and time t represent the same physical quantity, we can say that, in the case of a light beam, since c is a universal constant, they will provide perfectly accurate measures of each other. (For example, the light-year is used as a measure of distance.) Likewise, in the case of Boltzmann's equation, while we cannot say that thermodynamic entropy S and information entropy H represent the same physical quantity, we can say that, in the case of a thermodynamic system, since kB is a universal constant, they will provide perfectly accurate measures of each other.

The question then remains whether ln(W) is an information-theoretic quantity. If it is measured in bits, one can say that, given the macrostate, it represents the number of yes/no questions one must ask to determine the microstate, clearly an information-theoretic concept. Objectors point out that such a process is purely conceptual, and has nothing to do with the measurement of entropy. Then again, the whole of statistical mechanics is purely conceptual, serving only to provide an explanation of the "pure" science of thermodynamics.

Ultimately, the criticism of the link between thermodynamic entropy and information entropy is a matter of terminology, rather than substance. Neither side in the controversy will disagree on the solution to a particular thermodynamic or information-theoretic problem.

Topics of recent research

Is information quantized?

In 1995, Tim Palmer pointed out two unwritten assumptions about Shannon's definition of information that may make it inapplicable as such to quantum mechanics:

  • The supposition that there is such a thing as an observable state (for instance the upper face of a die or a coin) before the observation begins
  • The assumption that knowing this state does not depend on the order in which observations are made (commutativity)

Anton Zeilinger's and Caslav Brukner's article[19] synthesized and developed these remarks. The so-called Zeilinger's principle suggests that the quantization observed in QM could be bound to information quantization (one cannot observe less than one bit, and what is not observed is by definition "random"). Nevertheless, these claims remain quite controversial. Detailed discussions of the applicability of the Shannon information in quantum mechanics, together with an argument that Zeilinger's principle cannot explain quantization, have been published.[20][21][22] They show that Brukner and Zeilinger change, in the middle of the calculation in their article, the numerical values of the probabilities needed to compute the Shannon entropy, so that the calculation makes little sense.

Extracting work from quantum information in a Szilárd engine

In 2013, a description was published[23] of a two-atom version of a Szilárd engine using quantum discord to generate work from purely quantum information.[24] Refinements in the lower temperature limit were suggested.[25]

References

  1. ^ Schneider, T.D, Information theory primer with an appendix on logarithms, National Cancer Institute, 14 April 2007.
  2. ^ Antoine Bérut; Artak Arakelyan; Artyom Petrosyan; Sergio Ciliberto; Raoul Dillenschneider; Eric Lutz (8 March 2012), "Experimental verification of Landauer’s principle linking information and thermodynamics" (PDF), Nature 483 (7388): 187–190, Bibcode:2012Natur.483..187B, doi:10.1038/nature10872 
  3. ^ Shoichi Toyabe; Takahiro Sagawa; Masahito Ueda; Eiro Muneyuki; Masaki Sano (2010-09-29). "Information heat engine: converting information to energy by feedback control". Nature Physics 6 (12): 988–992. arXiv:1009.5287. Bibcode:2011NatPh...6..988T. doi:10.1038/nphys1821. We demonstrated that free energy is obtained by a feedback control using the information about the system; information is converted to free energy, as the first realization of Szilard-type Maxwell’s demon. 
  4. ^ Karnani, M.; Pääkkönen, K.; Annila, A. (2009). "The physical character of information". Proc. R. Soc. A 465 (2107): 2155–75. Bibcode:2009RSPSA.465.2155K. doi:10.1098/rspa.2009.0063. 
  5. ^ Leon Brillouin (1953), "The negentropy principle of information", J. Applied Physics 24, 1152-1163.
  6. ^ Leon Brillouin, Science and Information theory, Dover, 1956
  7. ^ Mahulikar, S.P.; Herwig, H. (August 2009). "Exact thermodynamic principles for dynamic order existence and evolution in chaos". Chaos, Solitons & Fractals 41 (4): 1939–48. Bibcode:2009CSF....41.1939M. doi:10.1016/j.chaos.2008.07.051. 
  8. ^ Overbye, Dennis. "Hawking's Breakthrough Is Still an Enigma". New York Times. Retrieved 19 December 2013. 
  9. ^ Schiffer M, Bekenstein JD (February 1989). "Proof of the quantum bound on specific entropy for free fields". Physical Review D 39 (4): 1109–15. Bibcode:1989PhRvD..39.1109S. doi:10.1103/PhysRevD.39.1109. PMID 9959747.  "Black Holes and Entropy". Physical Review D 7 (8): 2333. Bibcode:1973PhRvD...7.2333B. doi:10.1103/PhysRevD.7.2333. Ellis, George Francis Rayner; Hawking, S. W. (1973). The large scale structure of space-time. Cambridge, Eng: University Press. ISBN 0-521-09906-4.  von Baeyer, Christian, H. (2003). Information — the New Language of Science. Harvard University Press. ISBN 0-674-01387-5.  Callaway DJE (April 1996). "Surface tension, hydrophobicity, and black holes: The entropic connection". Physical Review E 53 (4): 3738–3744. arXiv:cond-mat/9601111. Bibcode:1996PhRvE..53.3738C. doi:10.1103/PhysRevE.53.3738. PMID 9964684.  Srednicki M (August 1993). "Entropy and area". Physical Review Letters 71 (5): 666–669. arXiv:hep-th/9303048. Bibcode:1993PhRvL..71..666S. doi:10.1103/PhysRevLett.71.666. PMID 10055336. 
  10. ^ Hirschman, Jr., I.I. (January 1957). "A note on entropy". American Journal of Mathematics 79 (1): 152–6. doi:10.2307/2372390. JSTOR 2372390. 
  11. ^ Zachos, C. K. (2007). "A classical bound on quantum entropy". Journal of Physics A: Mathematical and Theoretical 40 (21): F407. arXiv:hep-th/0609148. Bibcode:2007JPhA...40..407Z. doi:10.1088/1751-8113/40/21/F02. 
  12. ^ Alexander Stotland; Pomeransky; Eitan Bachmat; Doron Cohen (2004). "The information entropy of quantum mechanical states". Europhysics Letters 67 (5): 700–6. arXiv:quant-ph/0401021. Bibcode:2004EL.....67..700S. doi:10.1209/epl/i2004-10110-1. 
  13. ^ Deacon, Terrence W. (2011). Incomplete Nature: How Mind Emerged from Matter. W.W. Norton & Co. p. 74‐75,380. The analogy (of Shannon entropy) to thermodynamic entropy breaks down because Shannon’s concept is a logical (or structural) property, not a dynamical property. Shannon entropy, for example, does not generally increase spontaneously in most communication systems, so there is no equivalent to the second law of thermodynamics when it comes to the entropy of information. The arrangement of units in a message doesn’t spontaneously ‘tend’ to change toward equiprobablity. 
  14. ^ Morowitz, Harold (1986). "Entropy and Nonsense". Biology and Philosophy (Springer) 1 (4): 473–476. Retrieved 21 March 2016. Since C. E. Shannon introduced the information measure in 1948 and showed a formal analogy between the information measure (−∑ pi ln2 pi) and the entropy measure of statistical mechanics (−∑ f ln(f)), a number of works have appeared trying to relate "entropy" to all sorts of academic disciplines. Many of these theories involve profound confusion about the underlying thermal physics and their authors use the language and formulae of the physical sciences to bolster otherwise trivial and vacuous theories. 
  15. ^ Ben‐Naim, Arieh (2008). A Farewell to Entropy: Statistical Thermodynamics Based on Information. p. xviii. (quoting criticism)There is no invariant function corresponding to energy embedded in each of the hundreds of equations of information ‘entropy’ and thus no analog of temperature universally present in each of them. The point is that information ‘entropy’ in all of its myriad nonphysicochemical forms as a measure of information or abstract communication has no relevance to the evaluation of thermodynamic entropy change. 
  16. ^ Müller, Ingo (2007). A History of Thermodynamics: the Doctrine of Energy and Entropy. Springer. pp. 124–126. For level‐headed physicists, entropy—or order and disorder—is nothing by itself. It has to be seen and discussed in conjunction with temperature and heat, and energy and work. And, if there is to be an extrapolation of entropy to a foreign field, it must be accompanied by the appropriate extrapolations of temperature, heat, and work. 
  17. ^ Rapoport, Anatol. (1976). "General Systems Theory: a Bridge Between Two Cultures". Systems Research and behavioral science (University of Birmingham) 21 (4): 228–239. doi:10.1002/bs.3830210404. Retrieved 21 March 2016. In thermodynamic terms entropy is defined in terms of the relation between energy and temperature. In communication theory entropy refers to the uncertainty associated with messages. A more far‐fetched connection is difficult to imagine, but it has been conclusively demonstrated by the mathematical isomorphism between the two. 
  18. ^ Ben-Naim, Arieh (2012). Entropy and the Second Law: Interpretation and Misss-Interpretationsss. ISBN 978-9-814-40755-7. 
  19. ^ "Conceptual inadequacy of the Shannon information in quantum measurement". 2001. arXiv:quant-ph/0006087. Bibcode:2001PhRvA..63b2113B. doi:10.1103/PhysRevA.63.022113. 
  20. ^ Timpson, 2003
  21. ^ Hall, 2000
  22. ^ Mana, 2004
  23. ^ Jung Jun Park, Kang-Hwan Kim, Takahiro Sagawa, Sang Wook Kim (2013). "Heat Engine Driven by Purely Quantum Information.". Physical Review Letters 111 (23). arXiv:1302.3011. Bibcode:2013PhRvL.111w0402P. doi:10.1103/PhysRevLett.111.230402. Retrieved 19 December 2013. 
  24. ^ Zyga, Lisa. "Maxwell's demon can use quantum information". Phys.org (Omicron Technology Limited). Retrieved 19 December 2013. 
  25. ^ Martin Plesch, Oscar Dahlsten, John Goold, Vlatko Vedral (September 2013). "Comment on "Quantum Szilard Engine"". Phys. Rev. Lett. arXiv:1309.4209. Bibcode:2013PhRvL.111r8901P. doi:10.1103/PhysRevLett.111.188901. 
