Given a function $f: A \subset \mathbb{R}^N \to \mathbb{R}^M$, where $A$ is an open subset, and a point $x_0 \in A$, the function is said to be differentiable at $x_0$ if there exists a linear map $Df(x_0): \mathbb{R}^N \to \mathbb{R}^M$, called the differential, such that
$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0) - Df(x_0)[h]}{\|h\|} = 0 \in \mathbb{R}^M$$
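where $h \in V = \mathbb{R}^N$. Applied to a vector $v$, the differential is
$$Df(x_0)[v] \equiv \lim_{t \to 0} \frac{f(x_0 + t v) - f(x_0)}{t}, \quad v \in \mathbb{R}^N,\ t \in \mathbb{R}$$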
and represents a generalized form of the "rate of change" that the ordinary derivative represents. This rate of change is evaluated at a given point $x_0$ in space and in the direction given by $v$.
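As a rough numerical illustration of the directional limit, the sketch below estimates $Df(x_0)[v]$ with a central difference (a symmetric variant of the one-sided quotient) and checks that the result is linear in $v$. The map `f`, the points and the step size `step` are hypothetical choices made only for this example.

```python
import math

def f(x):
    """Hypothetical example map f: R^2 -> R^2."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1) + x2 ** 2]

def directional_derivative(f, x0, v, step=1e-6):
    """Central-difference estimate of Df(x0)[v]."""
    plus = f([a + step * b for a, b in zip(x0, v)])
    minus = f([a - step * b for a, b in zip(x0, v)])
    return [(p - m) / (2 * step) for p, m in zip(plus, minus)]

x0 = [1.0, 2.0]
v = [0.5, -1.0]
w = [2.0, 3.0]

# Estimate of Df(x0)[v].
dv = directional_derivative(f, x0, v)

# Linearity check: Df(x0)[v + w] should match Df(x0)[v] + Df(x0)[w].
v_plus_w = [a + b for a, b in zip(v, w)]
lhs = directional_derivative(f, x0, v_plus_w)
rhs = [a + b for a, b in zip(dv, directional_derivative(f, x0, w))]
```

Up to finite-difference error, `lhs` and `rhs` agree, which is what linearity of the differential requires.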
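If we set $M = 1$, $f$ is instead a real-valued function $f: A \subset \mathbb{R}^N \to \mathbb{R}$. Its differential is a linear functional, typically denoted $df(x_0): \mathbb{R}^N \to \mathbb{R}$, $v \mapsto df(x_0)[v]$. Being a linear functional on $V$, it is an element of the dual vector space $V^*$. As before, evaluating it on the basis vector $e_i$ gives the partial derivative along that direction:
$$df(x_0)[e_i] = \frac{\partial f}{\partial x_i}(x_0)$$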
If we specifically take the coordinate functions $x_1, \ldots, x_N: \mathbb{R}^N \to \mathbb{R}$ that extract the $i$-th component of a vector $v = (v_1, \ldots, v_N) \in V$, so that $x_i(v) = v_i$, their differentials $\{dx_i\}_{i=1,\ldots,N}$ form the dual basis of $\{e_i\}_{i=1,\ldots,N}$, that is, $dx_i[e_j] = \delta_{ij}$. In this basis, $df(x_0)$ can be expressed as
$$df(x_0) = \sum_{i=1}^{N} \frac{\partial f}{\partial x_i}(x_0)\, dx_i$$
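The dual-basis expansion can be sketched in code by treating each $dx_i$ as the functional that picks out the $i$-th component and weighting it with a finite-difference estimate of the corresponding partial derivative. The scalar field `f`, the helper names `partial`, `dx` and `df`, and the sample points are all hypothetical, chosen for illustration only.

```python
def f(x):
    """Hypothetical scalar field f: R^3 -> R."""
    return x[0] ** 2 * x[1] + 3.0 * x[2]

def partial(f, x0, i, step=1e-6):
    """Central-difference estimate of the i-th partial derivative at x0."""
    up = list(x0)
    down = list(x0)
    up[i] += step
    down[i] -= step
    return (f(up) - f(down)) / (2 * step)

def dx(i):
    """Coordinate differential dx_i: the linear functional v -> v_i."""
    return lambda v: v[i]

def df(f, x0):
    """Return df(x0) as a functional of v, expanded in the dual basis {dx_i}."""
    coeffs = [partial(f, x0, i) for i in range(len(x0))]
    return lambda v: sum(c * dx(i)(v) for i, c in enumerate(coeffs))

x0 = [1.0, 2.0, 0.0]
v = [1.0, -1.0, 2.0]

# The partial derivatives at x0 are (4, 1, 3), so df(x0)[v] is about 4 - 1 + 6 = 9.
value = df(f, x0)(v)
```

Evaluating `dx(i)` on the standard basis vectors returns 1 when the indices match and 0 otherwise, which is the defining property of a dual basis.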
The differential goes by several names (total derivative, total differential, exact differential), but "total derivative" most commonly refers to the form above, which extends the concept of derivative in the most straightforward way. For instance, the total derivative of a function $f(x(t), y(t), t)$ with respect to $t$ is, by the chain rule,
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial t}$$
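A quick way to sanity-check the total derivative is to compare the chain-rule expression against a direct finite difference of the composite function of $t$. The curves `x`, `y` and the function `f` below are hypothetical examples, not taken from the text.

```python
import math

def x(t):
    return math.cos(t)

def y(t):
    return t ** 2

def f(x_val, y_val, t):
    """Hypothetical f(x, y, t) = x * y + sin(t)."""
    return x_val * y_val + math.sin(t)

def total_derivative_chain_rule(t):
    """df/dt = (df/dx) x'(t) + (df/dy) y'(t) + df/dt, with hand-computed partials."""
    df_dx = y(t)         # partial of x * y with respect to x
    df_dy = x(t)         # partial of x * y with respect to y
    df_dt = math.cos(t)  # explicit dependence on t through sin(t)
    dx_dt = -math.sin(t)
    dy_dt = 2.0 * t
    return df_dx * dx_dt + df_dy * dy_dt + df_dt

def total_derivative_numeric(t, step=1e-6):
    """Central difference of the composite function t -> f(x(t), y(t), t)."""
    g = lambda s: f(x(s), y(s), s)
    return (g(t + step) - g(t - step)) / (2 * step)

t0 = 0.7
analytic = total_derivative_chain_rule(t0)
numeric = total_derivative_numeric(t0)
```

The two values agree up to finite-difference error. Dropping the explicit $\partial f / \partial t$ term would leave a discrepancy of $\cos(t_0)$.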
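Since the differential is a linear operator from $\mathbb{R}^N$ to $\mathbb{R}^M$, it can be represented as an $M \times N$ matrix, which is square only when $M = N$. This matrix is exactly the Jacobian matrix of $f$. In fact, the matrix elements of a linear operator with respect to the standard bases $\{e_j\}_{j=1,\ldots,N}$ of $\mathbb{R}^N$ and $\{e_i\}_{i=1,\ldots,M}$ of $\mathbb{R}^M$ are given by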
$$J_{ij} = e_i \cdot Df(x_0)[e_j] = e_i \cdot \frac{\partial f}{\partial x_j} = \frac{\partial f_i}{\partial x_j}$$
which is the matrix of all first-order partial derivatives of f, i.e. its Jacobian.
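The column-by-column structure of the Jacobian can be illustrated with a small finite-difference sketch: perturbing the $j$-th input coordinate yields an estimate of $Df(x_0)[e_j]$, which becomes column $j$. The map `f` (with $N = 3$, $M = 2$), the helper `jacobian` and the evaluation point are hypothetical choices for this example.

```python
import math

def f(x):
    """Hypothetical map f: R^3 -> R^2 (N = 3 inputs, M = 2 outputs)."""
    return [x[0] * x[1], math.sin(x[2]) + x[0]]

def jacobian(f, x0, step=1e-6):
    """Finite-difference Jacobian: J[i][j] approximates d f_i / d x_j at x0."""
    n = len(x0)
    m = len(f(x0))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up = list(x0)
        down = list(x0)
        up[j] += step
        down[j] -= step
        # Column j is an estimate of Df(x0)[e_j].
        column = [(a - b) / (2 * step) for a, b in zip(f(up), f(down))]
        for i in range(m):
            J[i][j] = column[i]
    return J

x0 = [1.0, 2.0, 0.0]
J = jacobian(f, x0)

# Applying J to a vector v gives an estimate of Df(x0)[v].
v = [1.0, 1.0, 1.0]
Jv = [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]
```

Here `J` has 2 rows and 3 columns, one row per output component and one column per input coordinate, and it approximates the exact matrix with rows $(2, 1, 0)$ and $(1, 0, 1)$.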