The second law of thermodynamics can be described by using the method of Lagrange multipliers to maximize entropy.
Consider an ensemble with information-theoretical entropy $S=-k_{B}\sum_{x}p(x)\log p(x)$, where $p(x)$ is the probability that the ensemble is in state $x$. As usual, $\sum_{x}p(x)=1$. Its internal energy is $U=\langle E \rangle =\sum_{x}E(x)p(x)$. The constraint functions are
$$g_{1}(x)=1-\sum_{x}p(x)$$
which enforces the normalization of the probabilities, and
$$g_{2}(x)=U-\sum_{x}E(x)p(x)$$
which fixes the internal energy.
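As a concrete illustration of these definitions, the following minimal sketch evaluates the entropy and the two constraint functions for a small discrete system. The three energy levels, the trial distribution, and the choice $k_{B}=1$ (natural units) are assumptions made only for this example.

```python
import math

K_B = 1.0  # assumption: natural units, k_B = 1


def entropy(p):
    """S = -k_B * sum_x p(x) log p(x); states with p(x) = 0 contribute nothing."""
    return -K_B * sum(px * math.log(px) for px in p if px > 0.0)


def internal_energy(p, energies):
    """U = sum_x E(x) p(x)."""
    return sum(px * ex for px, ex in zip(p, energies))


def g1(p):
    """Normalization constraint: zero when the probabilities sum to one."""
    return 1.0 - sum(p)


def g2(p, energies, u):
    """Energy constraint: zero when the mean energy equals the target u."""
    return u - internal_energy(p, energies)


energies = [0.0, 1.0, 2.0]  # hypothetical energy levels
p = [0.5, 0.3, 0.2]         # hypothetical trial distribution
u = internal_energy(p, energies)

print("S  =", entropy(p))
print("U  =", u)
print("g1 =", g1(p))
print("g2 =", g2(p, energies, u))
```

Both constraint functions vanish for an admissible distribution; the maximization that follows searches among such distributions for the one with the largest entropy.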
The Lagrangian $\mathcal{L}$ for these constraints is
\begin{align}
\mathcal{L}(x) &= S(x) +\lambda_{1}g_{1}(x) +\lambda_{2}g_{2}(x) \\
&=-k_{B}\sum_{x}p(x)\log p(x)+\lambda_{1}\left( 1-\sum_{x}p(x) \right)+\lambda_{2}\left( U-\sum_{x}p(x)E(x) \right)
\end{align}
where $k_{B}$ is the Boltzmann constant. The Lagrange multiplier theorem tells us that if some value $\bar{x}$ is a maximum of $S$ subject to the constraints, then there exist specific values of $\lambda_{1}$ and $\lambda_{2}$ such that $\bar{x}$ is a stationary point for $\mathcal{L}$:
$$\text{If }S(\bar{x})\text{ is a maximum}\quad\Rightarrow \quad \nabla \mathcal{L}(\bar{x};\lambda_{1},\lambda_{2})=0$$
In our case, the variables of $\mathcal{L}$ are the probabilities $p(x)$, so the gradient is just the derivative with respect to $p$, taken state by state:
\begin{align}
\frac{d\mathcal{L}}{d p}&=0 \\
&=\frac{d}{dp}\left( -k_{B}\sum_{x}p(x)\log p(x) \right)+ \lambda_{1}\frac{d}{dp}\left(1 -\sum_{x}p(x) \right)+ \lambda_{2}\frac{d}{dp}\left( U-\sum_{x}p(x)E(x) \right) \\
&=-k_{B}\sum_{x} \frac{d }{d p }(p\log p)-\lambda_{1}\sum_{x} \frac{d}{dp} p-\lambda_{2}\sum_{x}\frac{ d }{ dp } pE \\
&=-k_{B}\sum_{x}(\log p+1)-\lambda_{1}\sum_{x}1-\lambda_{2}\sum_{x}E \\
&=-\sum_{x}[k_{B}\log p(x)+k_{B}+\lambda_{1}+\lambda_{2}E(x) ]
\end{align}
Each term in the sum must vanish individually, because the probabilities $p(x)$ of different states are varied independently of each other:
$$0=k_{B}\log p(x)+k_{B}+\lambda_{1}+\lambda_{2}E(x)$$
Extracting $\log p(x)$ we get
$$\log p(x)=\frac{k_{B}+\lambda_{1}+\lambda_{2}E(x)}{-k_{B}}= \frac{-k_{B}-\lambda_{1}-\lambda_{2}E(x)}{k_{B}}$$
Therefore
$$p(x)=e^{(-k_{B}-\lambda_{1}-\lambda_{2}E(x))/k_{B}}$$
This equation must satisfy probability normalization:
$$\sum_{x}p(x)=1=\sum_{x}e^{(-k_{B}-\lambda_{1})/k_{B}}e^{-\lambda_{2}E(x)/k_{B}}=e^{(-k_{B}-\lambda_{1})/k_{B}}Z$$
where we introduced the partition function $Z$
$$Z\equiv\sum_{x}e^{-\lambda_{2}E(x)/k_{B}}$$
We can therefore write
$$p(x)=\frac{1}{Z} e^{-\lambda_{2}E(x)/k_{B}}$$
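The result can be checked numerically. The sketch below builds this distribution for a hypothetical three-level system (the energy levels, the value of $\lambda_{2}$, and $k_{B}=1$ are assumptions for illustration), confirms that the stationarity condition holds with $\lambda_{1}=k_{B}(\log Z-1)$ as implied by the normalization, and then perturbs the distribution along a direction that preserves both constraints to see that the entropy decreases.

```python
import math

K_B = 1.0  # assumption: natural units, k_B = 1


def boltzmann(energies, lam2):
    """Return (p, Z) with p(x) = exp(-lam2 * E(x) / k_B) / Z."""
    weights = [math.exp(-lam2 * e / K_B) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z


def entropy(p):
    """S = -k_B * sum_x p(x) log p(x), for strictly positive p(x)."""
    return -K_B * sum(px * math.log(px) for px in p)


energies = [0.0, 1.0, 2.0]  # hypothetical energy levels
lam2 = 0.7                  # hypothetical value of the second multiplier
p, z = boltzmann(energies, lam2)

# Normalization gives exp((-k_B - lam1) / k_B) = 1 / Z, so lam1 = k_B (log Z - 1).
lam1 = K_B * (math.log(z) - 1.0)

# Stationarity condition per state: k_B log p + k_B + lam1 + lam2 E = 0.
residuals = [K_B * math.log(px) + K_B + lam1 + lam2 * e
             for px, e in zip(p, energies)]

# For three states, this direction leaves both sum(p) and sum(p * E) unchanged.
e0, e1, e2 = energies
direction = [e1 - e2, e2 - e0, e0 - e1]
eps = 1e-3
p_plus = [px + eps * d for px, d in zip(p, direction)]
p_minus = [px - eps * d for px, d in zip(p, direction)]

print("residuals:", residuals)
print("S(p)      =", entropy(p))
print("S(p + dp) =", entropy(p_plus))
print("S(p - dp) =", entropy(p_minus))
```

Moving in either direction along the constraint surface lowers the entropy, which is consistent with the stationary point being a maximum rather than a minimum or a saddle.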
Note how the probability depends only on the second multiplier. Now that we know the probabilities, we can calculate entropy directly:
\begin{align}
S&=-k_{B}\sum_{x}p(x)\log p(x) \\
&=-k_{B}\sum_{x}p(x)\left[ -\log Z- \frac{\lambda_{2}E(x)}{k_{B}} \right] \\
&=k_{B}\log Z\underbrace{ \sum_{x}p(x) }_{ 1 }+\lambda_{2}\underbrace{ \sum_{x}p(x)E(x) }_{ U } \\
&=k_{B}\log Z+\lambda_{2}U
\end{align}
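The closed form $S=k_{B}\log Z+\lambda_{2}U$ can be compared against the direct sum. The following sketch does this for a hypothetical four-level system; the energies, the multiplier value, and $k_{B}=1$ are illustrative assumptions.

```python
import math

K_B = 1.0  # assumption: natural units, k_B = 1

energies = [0.0, 1.0, 2.0, 5.0]  # hypothetical energy levels
lam2 = 0.4                       # hypothetical value of the second multiplier

# Partition function and maximum-entropy probabilities.
weights = [math.exp(-lam2 * e / K_B) for e in energies]
z = sum(weights)
p = [w / z for w in weights]

# Internal energy U = sum_x p(x) E(x).
u = sum(px * e for px, e in zip(p, energies))

# Entropy from the definition and from the closed form.
s_direct = -K_B * sum(px * math.log(px) for px in p)
s_closed = K_B * math.log(z) + lam2 * u

print("direct sum :", s_direct)
print("closed form:", s_closed)
```

The two values agree up to floating-point rounding.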
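We want to determine what $\lambda_{2}$ is. To do this, we differentiate the entropy with respect to $U$. Although $Z$ depends on $\lambda_{2}$, which in turn depends on $U$, those contributions cancel because $\partial (k_{B}\log Z)/\partial \lambda_{2}=-U$, leaving
$$\frac{ \partial S }{ \partial U } =\lambda_{2}$$
and using the thermodynamic definition of temperature (not a Maxwell relation, but the first-derivative identity from which those relations follow)
$$T=\frac{ \partial U }{ \partial S } \quad\to \quad \frac{1}{T}=\frac{ \partial S }{ \partial U }$$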
we can state
$$\boxed{\lambda_{2}=\frac{1}{T}}$$
where $T$ is the temperature. Therefore, entropy is
$$\boxed{S=\frac{U}{T}+k_{B}\log Z}$$
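As a consistency check on the identification $\lambda_{2}=1/T$, the sketch below evaluates $S$ and $U$ at two nearby temperatures and estimates $\partial S/\partial U$ by a central finite difference. The energy levels, the temperature, the step size, and $k_{B}=1$ are assumptions chosen for illustration.

```python
import math

K_B = 1.0  # assumption: natural units, k_B = 1


def thermo(energies, temperature):
    """Return (S, U) for the maximum-entropy distribution with lam2 = 1/T."""
    lam2 = 1.0 / temperature
    weights = [math.exp(-lam2 * e / K_B) for e in energies]
    z = sum(weights)
    u = sum(e * w for e, w in zip(energies, weights)) / z
    s = u / temperature + K_B * math.log(z)
    return s, u


energies = [0.0, 1.0, 2.0]  # hypothetical energy levels
temperature = 2.0           # hypothetical temperature
h = 1e-4                    # finite-difference step in T

s_hi, u_hi = thermo(energies, temperature + h)
s_lo, u_lo = thermo(energies, temperature - h)

# dS/dU estimated along the curve traced out by varying T.
ds_du = (s_hi - s_lo) / (u_hi - u_lo)

print("dS/dU ~", ds_du)
print("1/T   =", 1.0 / temperature)
```

The finite-difference slope matches $1/T$ to within the discretization error of the step `h`.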