Differential


Given a function $f:A\subset \mathbb{R}^{N}\to \mathbb{R}^{M}$, where $A$ is an open subset, and a point $x_{0}\in A$, the function is said to be differentiable at $x_{0}$ if there exists a linear map¹ $Df(x_{0}):\mathbb{R}^{N}\to \mathbb{R}^{M}$, called the differential, such that

$$\lim_{h \to 0} \frac{f(x_{0}+h) - f(x_{0})-Df(x_{0})[h]}{\lVert h \rVert } = 0\in\mathbb{R}^{M}$$

where $h\in \mathbb{R}^{N}$. Applied to a vector $v$, with $h$ now a real scalar, the differential is

$$Df(x_{0})[v]\equiv \lim_{ h \to 0 } \frac{f(x_{0}+hv)-f(x_{0})}{h},\qquad v\in \mathbb{R}^{N}$$

and represents a generalization of the "rate of change" that the ordinary derivative measures. This rate of change is evaluated at a point $x_{0}$ and in the direction given by $v$.
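The definition can be checked numerically. The following is a minimal sketch, assuming a hypothetical example map $f(x,y)=(xy,\ \sin x+y^{2})$ and its hand-derived differential (neither comes from the text above). It uses the standard difference quotient $(f(x_{0}+hv)-f(x_{0}))/h$ with a small scalar $h$, and also tracks the remainder ratio from the definition of differentiability, which should shrink as the increment shrinks.

```python
import math

def f(p):
    """Hypothetical example map f: R^2 -> R^2, f(x, y) = (x*y, sin(x) + y^2)."""
    x, y = p
    return (x * y, math.sin(x) + y ** 2)

def Df(p, v):
    """Hand-derived differential of f at p, applied to the vector v (linear in v)."""
    x, y = p
    vx, vy = v
    return (y * vx + x * vy, math.cos(x) * vx + 2 * y * vy)

def norm(u):
    """Euclidean norm of a tuple."""
    return math.sqrt(sum(c * c for c in u))

def remainder_ratio(p, h):
    """||f(p+h) - f(p) - Df(p)[h]|| / ||h||, expected to tend to 0 as h -> 0."""
    shifted = tuple(a + b for a, b in zip(p, h))
    lin = Df(p, h)
    diff = tuple(fs - fp - l for fs, fp, l in zip(f(shifted), f(p), lin))
    return norm(diff) / norm(h)

def directional_quotient(p, v, h):
    """(f(p + h*v) - f(p)) / h for a small real scalar h."""
    shifted = tuple(a + h * b for a, b in zip(p, v))
    return tuple((fs - fp) / h for fs, fp in zip(f(shifted), f(p)))

x0 = (1.0, 2.0)
v = (0.5, -1.0)

# Remainder ratios for increments h = t*v with shrinking t.
ratios = [remainder_ratio(x0, (t * v[0], t * v[1])) for t in (1e-1, 1e-2, 1e-3)]

# Difference quotient versus the exact differential applied to v.
approx = directional_quotient(x0, v, 1e-6)
exact = Df(x0, v)

print(ratios)
print(approx, exact)
```

The ratios decrease roughly in proportion to the size of the increment, which is the behaviour expected of a differentiable function, and the difference quotient agrees with `Df(x0, v)` up to the finite-difference error.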

When evaluated on a basis vector $e_{i}$ of the vector space $V=\mathbb{R}^{N}$, the differential becomes the partial derivative with respect to that coordinate:

$$Df(x_{0})[e_{i}]=\frac{ \partial f }{ \partial x_{i} } (x_{0})$$

For real-valued functions

If we set $M=1$, $f$ is a real-valued function $f:A\subset \mathbb{R}^{N}\to \mathbb{R}$. Its differential is a linear functional, typically denoted $df(x_{0}):\mathbb{R}^{N}\to \mathbb{R}$, such that $v\mapsto df(x_{0})[v]$. Being a linear functional, it is an element of the dual vector space of $V=\mathbb{R}^{N}$. As before,

$$df(x_{0})[e_{i}]=\frac{ \partial f }{ \partial x_{i} } (x_{0})$$

Consider the coordinate functions $x_{1},\ldots,x_{N}:\mathbb{R}^{N}\to \mathbb{R}$, where $x_{i}$ extracts the $i$-th component of a vector $v=(v_{1},\ldots,v_{N})\in V$, so that $x_{i}(v)=v_{i}$. The differentials of these functions, $\{ dx_{i} \}_{i=1,\ldots,N}$, form the dual basis of $\{ e_{i} \}_{i=1,\ldots,N}$. In this basis, $df(x_{0})$ can be expressed as

$$df(x_{0})=\sum_{i=1}^{N} \frac{ \partial f }{ \partial x_{i} } (x_{0})\,dx_{i}$$

The differential goes by several names (total or exact derivative, total or exact differential), but "total derivative" is especially common for the form above, which extends the ordinary derivative to several variables in the most direct way. For instance, the total derivative with respect to $t$ of a function $f(x(t),y(t),t)$ is, by the chain rule,

$$\frac{df}{dt}=\sum_{i=1}^{2} \frac{ \partial f }{ \partial x_{i} } \frac{dx_{i}}{dt}+\frac{ \partial f }{ \partial t } \frac{dt}{dt}=\frac{ \partial f }{ \partial x } \frac{dx}{dt}+\frac{ \partial f }{ \partial y } \frac{dy}{dt}+\frac{ \partial f }{ \partial t }$$
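As a sanity check on the chain-rule formula, here is a small sketch comparing it with a direct numerical derivative. The function $f(x,y,t)=x^{2}y+t^{2}$ and the paths $x(t)=\cos t$, $y(t)=\sin t$ are hypothetical choices made only for illustration.

```python
import math

def f(x, y, t):
    """Hypothetical example: f(x, y, t) = x^2 * y + t^2."""
    return x * x * y + t * t

def x_of(t):
    return math.cos(t)

def y_of(t):
    return math.sin(t)

def total_derivative_chain_rule(t):
    """df/dt = (df/dx) x'(t) + (df/dy) y'(t) + df/dt (explicit), derived by hand."""
    x, y = x_of(t), y_of(t)
    df_dx = 2 * x * y          # partial derivative in x
    df_dy = x * x              # partial derivative in y
    df_dt_explicit = 2 * t     # partial derivative in the explicit t
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return df_dx * dx_dt + df_dy * dy_dt + df_dt_explicit

def total_derivative_numeric(t, h=1e-6):
    """Central difference of g(t) = f(x(t), y(t), t)."""
    g = lambda s: f(x_of(s), y_of(s), s)
    return (g(t + h) - g(t - h)) / (2 * h)

t0 = 0.7
by_chain_rule = total_derivative_chain_rule(t0)
by_difference = total_derivative_numeric(t0)
print(by_chain_rule, by_difference)
```

The two values should agree to within the finite-difference error. Note that the explicit $\partial f/\partial t$ term contributes separately from the terms that come through $x(t)$ and $y(t)$.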

For a real-valued function, the differential is related to the gradient by

$$\begin{align} df(x_{0})[v]&=\sum_{i=1}^{N} \frac{ \partial f }{ \partial x_{i} } (x_{0})\,dx_{i}[v]=\sum_{i=1}^{N} \frac{ \partial f }{ \partial x_{i} } (x_{0})\,dx_{i}\left[\sum_{j=1}^{N} v_{j}e_{j}\right]=\sum_{i=1}^{N} \frac{ \partial f }{ \partial x_{i} }(x_{0}) \sum_{j=1}^{N} v_{j}\underbrace{ dx_{i}[e_{j}] }_{ \delta_{ij} } \\ &=\sum_{i=1}^{N} \frac{ \partial f }{ \partial x_{i} }(x_{0})\, v_{i}=\nabla f(x_{0})\cdot v \end{align}$$

using the Kronecker delta $\delta_{ij}$.

When $dx_{i}$ is applied to the basis vector $e_{j}$ of $V$ we get

$$dx_{i}[e_{j}]=\frac{ \partial x_{i} }{ \partial x_{j} } =\delta_{ij}$$
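The identity $df(x_{0})[v]=\nabla f(x_{0})\cdot v$ can be illustrated numerically. This is only a sketch, assuming a hypothetical function $f(x,y,z)=x^{2}y+\sin z$. The partial derivatives (the values of $df(x_{0})$ on the basis vectors) are estimated by central differences and then compared with a directional difference quotient along $v$.

```python
import math

def f(p):
    """Hypothetical example: f(x, y, z) = x^2 * y + sin(z)."""
    x, y, z = p
    return x * x * y + math.sin(z)

def partial(func, p, i, h=1e-6):
    """Central-difference estimate of df(p)[e_i], the i-th partial derivative."""
    up, dn = list(p), list(p)
    up[i] += h
    dn[i] -= h
    return (func(up) - func(dn)) / (2 * h)

def gradient(func, p):
    """Components of df(p) in the dual basis {dx_i}, i.e. the gradient."""
    return [partial(func, p, i) for i in range(len(p))]

def directional(func, p, v, h=1e-6):
    """Central-difference estimate of df(p)[v] taken directly along v."""
    up = [a + h * b for a, b in zip(p, v)]
    dn = [a - h * b for a, b in zip(p, v)]
    return (func(up) - func(dn)) / (2 * h)

x0 = [1.0, 2.0, 0.5]
v = [0.3, -0.7, 2.0]

grad = gradient(f, x0)
via_gradient = sum(g * c for g, c in zip(grad, v))   # sum_i (df/dx_i) v_i
via_definition = directional(f, x0, v)               # df(x0)[v] along v

print(grad)
print(via_gradient, via_definition)
```

Both routes give the same number up to finite-difference error: pairing the gradient components with $v$ and differentiating along $v$ directly are two ways of evaluating the same linear functional.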

Matrix representation

Since the differential is a linear operator from $\mathbb{R}^{N}$ to $\mathbb{R}^{M}$, it can be represented as an $M\times N$ matrix (square only when $M=N$). This matrix is exactly the Jacobian matrix of $f$. Indeed, the matrix elements of a linear operator with respect to the standard bases, with $e_{j}$ a basis vector of $\mathbb{R}^{N}$ and $e_{i}$ a basis vector of $\mathbb{R}^{M}$, are given by

$$J_{ij}=e_{i}\cdot Df(x_{0})[e_{j}]=e_{i}\cdot \frac{ \partial f }{ \partial x_{j} }(x_{0}) =\frac{ \partial f_{i} }{ \partial x_{j} }(x_{0})$$

which is the matrix of all first-order partial derivatives of $f$, i.e. its Jacobian.
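To make the shape convention concrete, the sketch below builds a finite-difference Jacobian column by column, where column $j$ approximates $Df(x_{0})[e_{j}]$. The map $f(x,y,z)=(xy,\ y+\sin z)$ from $\mathbb{R}^{3}$ to $\mathbb{R}^{2}$ and the helper name `jacobian` are hypothetical, chosen only for this example.

```python
import math

def f(p):
    """Hypothetical example map f: R^3 -> R^2, f(x, y, z) = (x*y, y + sin(z))."""
    x, y, z = p
    return [x * y, y + math.sin(z)]

def jacobian(func, p, h=1e-6):
    """Finite-difference Jacobian with M rows (outputs) and N columns (inputs)."""
    n = len(p)            # N: dimension of the domain
    m = len(func(p))      # M: dimension of the codomain
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        # Column j approximates Df(p)[e_j] by a central difference.
        col = [(a - b) / (2 * h) for a, b in zip(func(up), func(dn))]
        for i in range(m):
            J[i][j] = col[i]   # J_ij = d f_i / d x_j
    return J

x0 = [1.0, 2.0, 0.5]
J = jacobian(f, x0)
shape = (len(J), len(J[0]))

print(shape)
for row in J:
    print(row)
```

For this map with $N=3$ and $M=2$, the result has 2 rows and 3 columns, and its entries approximate the exact partial derivatives $\partial f_{i}/\partial x_{j}$ at $x_{0}$.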

Footnotes

  1. Here "linear map" does not necessarily mean "linear function". It is any map that obeys the linearity property $f(\alpha v+\beta w)=\alpha f(v)+\beta f(w)$.