Microcanonical ensemble


The microcanonical ensemble is an ensemble whose probability density function is

$$\rho(\mathbf{q},\mathbf{p})=\begin{cases} \text{constant} & E<H(\mathbf{q},\mathbf{p})<E+\Delta \\ 0&\text{otherwise} \end{cases}$$

where $H$ is the Hamiltonian of the system and $E$ is its total energy. It represents an isolated system in thermal equilibrium whose energy $E$ is precisely defined and conserved. Given this definition, the equal a priori probability hypothesis applies to it directly, hence the constant density function.

In practice, we instead take the system energy to lie in an interval $[E,E+\Delta]$ to account for uncertainty in measurement, since no measurement of the energy is free of error. Nevertheless, $\Delta$ is to be seen as a tiny constant with respect to the energy ($\Delta\ll E$), so $[E,E+\Delta]$ is a tiny interval.

It differs from the canonical and grand canonical ensembles, which describe systems that are not isolated and whose energy therefore fluctuates.

Expectation values

Given some dynamical variable $f(\mathbf{q},\mathbf{p})$, its most likely value (the mode) is the one that $f$ takes in the largest number of systems of the ensemble. The mean, on the other hand, is the usual ensemble average:

$$\langle f \rangle = \frac{\int f(\mathbf{q},\mathbf{p})\rho(\mathbf{q},\mathbf{p})\,d^{3N}q\,d^{3N}p}{\int \rho(\mathbf{q},\mathbf{p})\,d^{3N}q\,d^{3N}p}$$

The mean and mode tend to coincide if the mean squared error (MSE) of $f$, that is, its mean square fluctuation about $\langle f \rangle$ measured relative to $\langle f \rangle^{2}$, tends to zero. Empirically, the MSE is inversely proportional to the number of particles in the system $N$: $\text{MSE}\propto 1/N$. So for large $N$, the mean and the mode coincide.
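The $1/N$ scaling can be checked exactly on a toy model. The sketch below assumes $N$ independent two-state particles with all $2^{N}$ microstates equally likely (an illustrative assumption, not a system defined in this note) and takes $f=n_{1}$, the number of particles in one of the two states. `occupation_stats` is a hypothetical helper name.

```python
from math import comb

def occupation_stats(N):
    """Mean, mode and relative variance of n1 when all 2**N microstates are equally likely."""
    total = 2 ** N
    # Probability of finding exactly n particles in the chosen state.
    probs = [comb(N, n) / total for n in range(N + 1)]
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    mode = max(range(N + 1), key=lambda n: probs[n])
    return mean, mode, var / mean ** 2

for N in (10, 100, 1000):
    mean, mode, rel = occupation_stats(N)
    # rel * N stays close to 1 if the relative variance scales like 1/N.
    print(N, mean, mode, rel * N)
```

For this toy model the relative variance works out to exactly $1/N$, so the last column should stay close to 1 while the mean and the mode agree for even $N$.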

Entropy

To find the entropy function for the ensemble, let's introduce a function $\Gamma(E)$ which counts the number of states whose energy is between $E$ and $E+\Delta$:

$$\Gamma(E)=\int\limits_{E<H(\mathbf{q},\mathbf{p})<E+\Delta}d^{3N}q\,d^{3N}p$$

This function explicitly depends on $E$, but also on the number of particles $N$ (through the number of integration variables) and the volume $V$ (through the integration limits on $\mathbf{q}$). $\Gamma$ returns a volume in phase space, which can be pictured as a thin spherical shell bounded by the two hypersurfaces of energies $E$ and $E+\Delta$.

Let's introduce another function $\Sigma(E)$ that counts all states with energy below $E$:

$$\Sigma(E)=\int\limits_{H(\mathbf{q},\mathbf{p})<E}d^{3N}q\,d^{3N}p$$

This is the volume of the sphere bounded by the hypersurface of energy $E$. A spherical shell is just the difference between two spheres of different radii, and so the two functions are related by

$$\Gamma(E)=\Sigma(E+\Delta)-\Sigma(E)$$

Since the thickness $\Delta$ of the shell is small compared to the radius ($\Delta\ll E$), the volume of the shell is well approximated by

$$\Gamma(E)\simeq\Delta\cdot\frac{ \partial \Sigma(E) }{ \partial E }=\Delta \cdot\omega(E)$$

where $\omega(E)=\partial \Sigma(E)/\partial E$ is the density of states at energy $E$.

Since $\Gamma$ is a number of states, we can reasonably assume the entropy comes from Boltzmann's definition of entropy:

$$S(E,V)=k\log \Gamma(E,V)$$

where $k$ is some constant (presumably the Boltzmann constant). However, to show that this really is the entropy we need to prove that

  1. $S$ is extensive;
  2. $S$ obeys the second law of thermodynamics.

It can be proven that the entropy can be calculated from any of $\Gamma$, $\omega$ and $\Sigma$, and that the results are all equal up to an additive constant (more precisely, up to terms of order $\log N$ or smaller, which are negligible next to $S\sim N$):

$$S=k_{B}\log \Gamma(E)\qquad S=k_{B}\log \omega(E)\qquad S=k_{B}\log \Sigma(E)$$
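To get a feel for how little the choice matters, here is a small numerical sketch. It assumes an ideal-gas-like scaling $\Sigma(E)=E^{3N/2}$ with all prefactors dropped and a fixed energy per particle, $E=Ne$. Both are illustrative assumptions, and `log_volumes`, `e` and `delta` are hypothetical names.

```python
import math

def log_volumes(N, e=1.0, delta=1e-3):
    """Return (log Sigma, log omega, log Gamma) for the assumed Sigma(E) = E**(3N/2)."""
    E = N * e
    log_sigma = 1.5 * N * math.log(E)
    # omega = dSigma/dE = (3N/2) * E**(3N/2 - 1)
    log_omega = math.log(1.5 * N) + (1.5 * N - 1) * math.log(E)
    # Gamma is approximated by the thin-shell volume delta * omega
    log_gamma = math.log(delta) + log_omega
    return log_sigma, log_omega, log_gamma

for N in (10, 1000, 100000):
    s, w, g = log_volumes(N)
    # Differences per particle shrink as N grows.
    print(N, (s - w) / N, (s - g) / N)
```

Under these assumptions the three logarithms differ by an amount that does not grow with $N$, so the difference per particle goes to zero.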

Extensiveness

Let's start with the first property. Consider some chamber divided into two volumes $V_{1}$ and $V_{2}$, respectively containing $N_{1}$ and $N_{2}$ particles.


If entropy is extensive, then the total entropy must be given by the combination of the entropies of the subsystems [1]. Entropy measures disorder, so if the two systems interacted strongly with each other, the interaction would contribute a lot of entropy. For this proof, let's assume that the energy of interaction $E_\text{int}$ between 1 and 2 is much smaller than the energy of each system, $E_\text{int}\ll E_{1}$ and $E_\text{int}\ll E_{2}$. This allows us to treat the two as non-interacting and leaves us with just the internal entropies. For the interaction to be negligible we need two things:

  1. the range of interaction between particles must be finite (we'll send the volume to infinity later, so any finite value is fine). This excludes long-range interactions like gravity from this proof;
  2. the contact-surface-to-volume ratio must become negligible as the volume goes to infinity. This excludes weird, possibly nonphysical contact surfaces from this proof.

If these are true, then the Hamiltonian is separable: $H(\mathbf{q},\mathbf{p})=H_{1}(\mathbf{q}_{1},\mathbf{p}_{1})+H_{2}(\mathbf{q}_{2},\mathbf{p}_{2})$. To continue, let's define the entropies separately as $S_{1}=k\log \Gamma_{1}(E_{1})$ and $S_{2}=k\log \Gamma_{2}(E_{2})$. $\Gamma_{1}$ and $\Gamma_{2}$ are volumes in the respective phase spaces $(\mathbf{q}_{1},\mathbf{p}_{1})$ and $(\mathbf{q}_{2},\mathbf{p}_{2})$. We'd like to find the total entropy. Our simplest option is to just sum the two:

$$S=k\log \Gamma_{1}(E_{1})+k\log \Gamma_{2}(E_{2})=k\log[\Gamma_{1}(E_{1})\Gamma_{2}(E_{2})]$$

The product $\Gamma_{1}(E_{1})\Gamma_{2}(E_{2})$ is the total number of states whose energy is between $E_{1}+E_{2}$ and $E_{1}+E_{2}+2\Delta$ [2]. However, we'd be wrong. The reason is that $E_{1}$ and $E_{2}$ are, as it stands, undefined. They are just one possible subdivision of the total energy, which leads to only one possible set of states. On top of merging the states with this product, we also need to consider every possible pair of $E_{1}$ and $E_{2}$ that the total energy can be split into (there's nothing preventing any combination, so they are all valid states that need to be counted) [3]. To count the pairs, we divide $E_{1}$ and $E_{2}$ into intervals of width $\Delta$, starting from zero. This division requires the energy to be bounded below. Now, by energy conservation, an energy pair can be written as $E_{i}$ and $E-E_{i}$, where $i$ labels a given interval. With this, we can find the total number of states by summing over every interval:

$$\Gamma(E)=\sum_{i=1}^{E/\Delta} \Gamma_{1}(E_{i})\Gamma_{2}(E-E_{i})$$

If we plug this back into the Boltzmann entropy we get

$$S=k\log \left[ \sum_{i=1}^{E/\Delta} \Gamma_{1}(E_{i})\Gamma_{2}(E-E_{i}) \right]$$

Let $\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})$ with $\bar{E}_{1}+\bar{E}_{2}=E$ be the largest term in this sum. We must have

$$\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})\leq \Gamma(E)\leq \frac{E}{\Delta}\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})$$

The first inequality is true because $\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})$ is just one term in the sum for $\Gamma(E)$, and every term is non-negative. The second is true because the sum has $E/\Delta$ terms, none of which is larger than the largest one. If we switch to entropies, the inequalities become

$$k\log [\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})]\leq S(E,V)\leq k\log[\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})]+k\log \frac{E}{\Delta}$$

See how the only difference between the two outer bounds is the $k\log (E/\Delta)$ term. If it were to vanish, the entropy $S(E,V)$ would be exactly $k\log[\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})]$. To proceed, note that $\log \Gamma_{i}(E_{i})\sim N_{i}$ [4], and also that $E\propto N_{1}+N_{2}=N$ and therefore $\log (E/\Delta) \sim \log N$. Clearly, the sum goes something like

$$k\log \Gamma_{1}\Gamma_{2}+k\log \frac{E}{\Delta}\sim N+\log N$$

$N$ is something massive, of order $10^{23}$, so this sum looks like $10^{23}+23$ (here using $\log_{10}$, but the point stands for any base). Evidently, the second term is completely insignificant and we might as well set it to zero. But if we do this, we are left with

$$S(E,V)\simeq k\log[\Gamma_{1}(\bar{E}_{1})\Gamma_{2}(\bar{E}_{2})]=k\log \Gamma_{1}(\bar{E}_{1})+k\log \Gamma_{2}(\bar{E}_{2})=S_{1}+S_{2}$$

And so we proved that $S$ is (approximately) additive over subsystems, i.e. it is extensive. Furthermore, the only term we are left with is the largest contribution, which means that the only term that matters in the sum is the one with the highest entropy.
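The dominance of the largest term can be seen numerically. The sketch below assumes $\Gamma_{i}(E_{i})=E_{i}^{aN_{i}}$ with $a=3/2$ (an ideal-gas-like scaling with prefactors dropped, chosen only for illustration) and splits the total energy into `M` intervals. The function names and default values are hypothetical.

```python
import math

def log_terms(N1, N2, E, M, a=1.5):
    """log[Gamma_1(E_i) * Gamma_2(E - E_i)] for E_i = i * E / M, i = 1..M-1."""
    delta = E / M
    return [a * N1 * math.log(i * delta) + a * N2 * math.log(E - i * delta)
            for i in range(1, M)]

def log_sum_and_max(N1, N2, E=1.0, M=10000):
    """Return (log of the full sum, log of its largest term)."""
    logs = log_terms(N1, N2, E, M)
    top = max(logs)
    # log-sum-exp: factor out the largest term so the exponentials cannot overflow
    log_sum = top + math.log(sum(math.exp(x - top) for x in logs))
    return log_sum, top

for N in (10, 100, 1000):
    log_sum, top = log_sum_and_max(N, N)
    # The gap between the two stays below log(M) while |top| grows like N.
    print(N, log_sum - top, abs(top))
```

Working with logarithms throughout avoids ever forming the astronomically large products themselves.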

Equilibrium temperature

To prove that $S$ obeys the second law of thermodynamics, we'll need to check that the state with maximum entropy is also the most likely. In other words, the state we found above, with energies $(\bar{E}_{1},\bar{E}_{2})$, must be so overwhelmingly likely as to make every other state look impossibly rare. Since the state density is uniform (by the definition of the microcanonical ensemble), being the most likely macrostate is the same as having the largest count of compatible microstates. As such, $\bar{E}_{1}$ and $\bar{E}_{2}$ must maximize $\Gamma_{1}(E_{1})\Gamma_{2}(E_{2})$ under the constraint $E_{1}+E_{2}=E$. We can use Lagrange multipliers to find the maximum under this constraint. $(\bar{E}_{1},\bar{E}_{2})$ is a stationary point, for which

$$\delta[\Gamma_{1}(E_{1})\Gamma_{2}(E_{2})]=0,\qquad \delta E_{1}+\delta E_{2}=0$$

From this, dividing by $\Gamma_{1}\Gamma_{2}$ and using the fact that the constraint forces the variations of $E_{1}$ and $E_{2}$ to be opposite, we have

$$\left.\frac{ \partial }{ \partial E_{1} } \log \Gamma_{1}(E_{1})\right|_{\bar{E}_{1}}-\left.\frac{ \partial }{ \partial E_{2} } \log \Gamma_{2}(E_{2})\right|_{\bar{E}_{2}}=0$$

and so

$$\left.\frac{ \partial S_{1}(E_{1}) }{ \partial E_{1} }\right|_{E_{1}=\bar{E}_{1}}=\left.\frac{ \partial S_{2}(E_{2}) }{ \partial E_{2} }\right|_{E_{2}=\bar{E}_{2}}$$

If we introduce the temperature through the thermodynamic relation

$$T=\left( \frac{ \partial U }{ \partial S } \right)_{V}$$

we can define the inverse by inverting the above relation (and renaming $U\to E$)

$$\frac{1}{T}=\frac{ \partial S(E,V) }{ \partial E }$$

and substituting it above we get

$$T_{1}=T_{2}$$

which is exactly what we should expect as we are studying the system in the context of thermal equilibrium. But the definition of temperature we used comes from absolute temperature, which inherently relies on the Boltzmann constant. Therefore, $k$ has to be precisely the Boltzmann constant $k_{B}$, else the definition of $T$ would no longer hold.

As a side note, $T_{1}=T_{2}$ holds for any two subsystems, which means that the global temperature of an isolated system described by the microcanonical ensemble also fixes the temperature (and therefore thermal equilibrium) of all of its components.
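The equal-temperature condition can be checked on a grid. As a sketch, assume again $\Gamma_{i}(E_{i})=E_{i}^{aN_{i}}$ with $a=3/2$, so that $S_{i}=k\,aN_{i}\log E_{i}$ and $\partial S_{i}/\partial E_{i}=k\,aN_{i}/E_{i}$. The model and the names `most_likely_split` and `inverse_temperature` are illustrative assumptions.

```python
import math

def most_likely_split(N1, N2, E=1.0, M=100000, a=1.5):
    """Grid search for the (E1, E2) maximizing log Gamma_1(E1) + log Gamma_2(E - E1)."""
    delta = E / M
    best = max(
        range(1, M),
        key=lambda i: a * N1 * math.log(i * delta) + a * N2 * math.log(E - i * delta),
    )
    return best * delta, E - best * delta

def inverse_temperature(N, E, a=1.5):
    """(1/k) dS/dE for the assumed S = k * a * N * log(E)."""
    return a * N / E

E1, E2 = most_likely_split(300, 700)
print(E1, E2)
print(inverse_temperature(300, E1), inverse_temperature(700, E2))
```

In this model the maximum sits where the energy per particle is the same in both subsystems, which is the same statement as $T_{1}=T_{2}$.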

Second law

The only thing left to prove is that $S$ obeys the second law of thermodynamics. Thankfully this is straightforward: the second law says that when an isolated thermodynamic system goes from one equilibrium state to another through a transformation, the entropy does not decrease. In a microcanonical ensemble, entropy is a function of $E$, $N$ and $V$. But $E$ and $N$ are constant, so the only thing we can change to vary the state is $V$, and for an isolated system $V$ can only grow (say, by removing an internal wall), since compressing it would require external work. If $V$ is increased, then $\Sigma(E)$ must also increase as we are integrating over more states (recall that the integration bounds are affected by $V$). Since $S=k_{B}\log \Sigma(E)$, and $\Sigma(E)$ can only increase, $S$ can also only increase. Thus, the second law is preserved, and with that we proved that $S$ is formally entropy.
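As a concrete check, take the ideal gas as a worked example (an assumption made here for illustration; it is not treated elsewhere in this note). With no interactions, each position integral in $\Sigma$ contributes a factor $V$, so the volume dependence can be pulled out explicitly:

$$\Sigma(E)\propto V^{N}E^{3N/2}\quad\Longrightarrow\quad S=k_{B}\log\Sigma(E)=Nk_{B}\log V+\frac{3}{2}Nk_{B}\log E+\text{const}$$

Letting the gas expand freely from $V$ to $2V$ at fixed $E$ and $N$ then gives

$$\Delta S=Nk_{B}\log\frac{2V}{V}=Nk_{B}\log 2>0$$

so the entropy grows with the volume, as the second law requires.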

Connection to information theory

Consider the definition of entropy in terms of information-theoretical entropy, $S=k_{B}S_\text{inf}$. Note that this has the same form as the three expressions above, just with $S_\text{inf}$ instead of the $\log$ of $\Gamma$, $\omega$ or $\Sigma$. This suggests that all three of these are formally types of information-theoretical entropy, which then gets converted into physical entropy by the Boltzmann constant.

Examples

> Consider a two-level system: $N$ particles, each of which sits in one of two levels with energies $0$ and $E$. If $n_{1}$ particles are in the upper level, the internal energy is $U=n_{1}E$.
>
> With some combinatorics, the number of state combinations due to $n_{1}$ is
> $$W=\begin{pmatrix}N \\ n_{1}\end{pmatrix}=\frac{N!}{n_{1}!(N-n_{1})!}$$
> using the binomial coefficient. The entropy therefore is
> $$S=k_{B}\log W$$
> Dividing entropy by the Boltzmann constant we get
> $$\frac{S}{k_{B}}=\log N!-\log n_{1}!-\log[(N-n_{1})!]\simeq N\log N- \frac{U}{E}\log \frac{U}{E}- \left( N- \frac{U}{E} \right)\log\left( N- \frac{U}{E} \right)$$
> using Stirling's approximation and $n_{1}=U/E$.
>
> As for the temperature, we have from the thermodynamic definition of temperature:
> $$\frac{1}{T}=\frac{\partial S}{\partial U}$$
> We use the inverse just because it's easier, as we already have $S$ anyway. So
> $$\frac{1}{T}=\frac{k_{B}}{E} \log\left( \frac{NE-U}{U} \right)$$
> So
> $$T=\frac{E}{k_{B}\log\left( \frac{NE-U}{U} \right)}$$
> Note that this temperature could very well be negative, which happens whenever $U>NE/2$. This, of course, is non-physical, but it's happening here regardless. The reason why it happens is that the system is bounded in energy from above, that is, there is a maximum energy that the system cannot cross. The unphysical nature of negative temperature implies that systems with an energy maximum cannot exist, or at most exist as approximations of unbounded systems. This is the case for [[Two-level system|two-level systems]], which in practice do exist, but only within a short period of time. For instance, a [[Qubit]] is a two-level system during the short period of its measurement, but if we were to observe one over long periods of time, it would not remain bounded to two levels, as that would imply the possibility of negative temperature.
>
> We can also invert the last equation to find the internal energy of the system, writing $\beta=1/k_{B}T$:
> $$U=\frac{NE}{e^{\beta E}+1}$$
> or alternatively, using $n_{1}=U/E$, the fraction of particles in the $n_{1}$ state:
> $$\frac{n_{1}}{N}=\frac{1}{e^{\beta E}+1}$$
> This is the [[Fermi-Dirac distribution]] for a system with no [[chemical potential]]. It is interesting to see how we managed to find a strictly quantum property from a purely classical treatment. The reason is that a two-state system is inherently a quantum idea, as it requires the discretization of energy.
>
> We can also derive the [[heat capacity]] from its definition:
> $$C(T)=\frac{\partial U}{\partial T}=\frac{NE^{2}}{k_{B}T^{2}} \frac{e^{\beta E}}{\left(e^{\beta E}+1\right)^{2}}$$
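The formulas of the two-level example can be cross-checked numerically. The sketch below works in units where $k_{B}=1$ and calls the level energy `eps` (it plays the role of $E$ in the example). Both choices, and all function names, are assumptions made for illustration. The heat capacity is written with the squared denominator that differentiating $U=N\varepsilon/(e^{\beta\varepsilon}+1)$ with respect to $T$ produces.

```python
import math

K_B = 1.0  # assumption: units where the Boltzmann constant is 1

def entropy_exact(N, n1):
    """S/k_B = log of the binomial coefficient, computed via log-gamma."""
    return math.lgamma(N + 1) - math.lgamma(n1 + 1) - math.lgamma(N - n1 + 1)

def entropy_stirling(N, n1):
    """S/k_B from Stirling's approximation, log n! ~ n log n - n."""
    return N * math.log(N) - n1 * math.log(n1) - (N - n1) * math.log(N - n1)

def temperature(N, U, eps):
    """T from 1/T = (k_B / eps) * log((N*eps - U) / U)."""
    return eps / (K_B * math.log((N * eps - U) / U))

def internal_energy(N, T, eps):
    """U = N*eps / (exp(eps / (k_B T)) + 1)."""
    return N * eps / (math.exp(eps / (K_B * T)) + 1)

def heat_capacity(N, T, eps):
    """C = dU/dT = N eps^2 / (k_B T^2) * x / (x + 1)^2, with x = exp(eps / (k_B T))."""
    x = math.exp(eps / (K_B * T))
    return N * eps ** 2 / (K_B * T ** 2) * x / (x + 1) ** 2

N, eps = 10 ** 6, 1.0
T = temperature(N, 0.25 * N * eps, eps)
print(T, internal_energy(N, T, eps))        # U(T) recovers the U we started from
print(temperature(N, 0.75 * N * eps, eps))  # negative once U exceeds N*eps/2
```

Comparing `heat_capacity` against a finite-difference derivative of `internal_energy` is a quick way to confirm the two expressions are consistent.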

Footnotes

  1. It doesn't necessarily need to be a simple sum (e.g. due to interactions), but it needs to combine so that the total has terms for each subsystem.

  2. It's a product and not a sum because we want all possible combinations of states of 1 and 2. For example, if 1 and 2 both have 3 states each, there are $3\times 3=9$ combinations, not $3+3=6$. If we call the states $a,b,c$ for 1 and $x,y,z$ for 2, the combinations are $ax,ay,az,bx,by,bz,cx,cy,cz$.

  3. At this point it's useful to remember that our original division into two systems is completely arbitrary. Systems 1 and 2 don't exist, they're fictitious, they're just a tool to count possible states, so any pair of these is eligible.

  4. This is because $\Gamma$ counts combinations, which are given by $N!$, so $\Gamma \sim N!$. Stirling's approximation then yields $\log \Gamma\sim N$ (up to a logarithmic factor).