Gateaux derivative

Generalization of the concept of directional derivative


In mathematics, the Gateaux differential or Gateaux derivative is a generalization of the concept of directional derivative in differential calculus. Named after René Gateaux, it is defined for functions between locally convex topological vector spaces such as Banach spaces. Like the Fréchet derivative on a Banach space, the Gateaux differential is often used to formalize the functional derivative commonly used in the calculus of variations and physics.

Unlike other forms of derivatives, the Gateaux differential of a function may be a nonlinear operator. However, often the definition of the Gateaux differential also requires that it be a continuous linear transformation. Some authors, such as Tikhomirov (2001), draw a further distinction between the Gateaux differential (which may be nonlinear) and the Gateaux derivative (which they take to be linear). In most applications, continuous linearity follows from some more primitive condition which is natural to the particular setting, such as imposing complex differentiability in the context of infinite dimensional holomorphy or continuous differentiability in nonlinear analysis.

Definition

Suppose $X$ and $Y$ are locally convex topological vector spaces (for example, Banach spaces), $U\subseteq X$ is open, and $F:U\to Y$. The Gateaux differential $dF(u;\psi)$ of $F$ at $u\in U$ in the direction $\psi\in X$ is defined as

$$dF(u;\psi)=\lim_{\tau\to 0}\frac{F(u+\tau\psi)-F(u)}{\tau}=\left.\frac{d}{d\tau}F(u+\tau\psi)\right|_{\tau=0} \tag{1}$$

If the limit exists for all $\psi\in X$, then one says that $F$ is Gateaux differentiable at $u$.

The limit appearing in (1) is taken relative to the topology of $Y$. If $X$ and $Y$ are real topological vector spaces, then the limit is taken for real $\tau$. On the other hand, if $X$ and $Y$ are complex topological vector spaces, then the limit above is usually taken as $\tau\to 0$ in the complex plane, as in the definition of complex differentiability. In some cases, a weak limit is taken instead of a strong limit, which leads to the notion of a weak Gateaux derivative.
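The difference quotient in (1) can be evaluated numerically in the finite-dimensional real case. The following is a minimal sketch, assuming $X=\mathbb{R}^{2}$ and $Y=\mathbb{R}$; the function `F`, the helper `gateaux_quotient`, and the step size `tau` are illustrative choices that do not come from the article.

```python
import math

def gateaux_quotient(F, u, psi, tau=1e-6):
    """Difference quotient (F(u + tau*psi) - F(u)) / tau from definition (1)."""
    shifted = [ui + tau * pi for ui, pi in zip(u, psi)]
    return (F(shifted) - F(u)) / tau

def F(p):
    # Illustrative smooth function on R^2 (an assumption, not from the text).
    x, y = p
    return x * x * y + math.sin(y)

u = [1.0, 2.0]
psi = [3.0, -1.0]

approx = gateaux_quotient(F, u, psi)
# For this smooth F the differential is the gradient (2xy, x^2 + cos y) applied to psi.
exact = 2 * u[0] * u[1] * psi[0] + (u[0] ** 2 + math.cos(u[1])) * psi[1]
print(approx, exact)
```

For a small fixed `tau` the quotient only approximates the limit; the two printed values should agree to several digits for this smooth choice of `F`.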

Linearity and continuity

At each point $u\in U$, the Gateaux differential defines a function

$$dF(u;\cdot):X\to Y.$$

This function is homogeneous in the sense that for all scalars $\alpha$,

$$dF(u;\alpha\psi)=\alpha\,dF(u;\psi).$$

However, this function need not be additive, so that the Gateaux differential may fail to be linear, unlike the Fréchet derivative. Even if linear, it may fail to depend continuously on $\psi$ if $X$ and $Y$ are infinite dimensional. Furthermore, for Gateaux differentials that are linear and continuous in $\psi$, there are several inequivalent ways to formulate their continuous differentiability.

For example, consider the real-valued function $F$ of two real variables defined by

$$F(x,y)=\begin{cases}\dfrac{x^{3}}{x^{2}+y^{2}}&\text{if }(x,y)\neq(0,0),\\0&\text{if }(x,y)=(0,0).\end{cases}$$
This is Gateaux differentiable at $(0,0)$, with its differential there being
$$dF(0,0;a,b)=\begin{cases}\lim_{\tau\to 0}\dfrac{F(\tau a,\tau b)-0}{\tau}&(a,b)\neq(0,0),\\0&(a,b)=(0,0)\end{cases}=\begin{cases}\dfrac{a^{3}}{a^{2}+b^{2}}&(a,b)\neq(0,0),\\0&(a,b)=(0,0).\end{cases}$$
This differential is continuous but not linear in the arguments $(a,b)$. In infinite dimensions, any discontinuous linear functional on $X$ is Gateaux differentiable, and its Gateaux differential at $0$ is linear but not continuous.
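The failure of additivity in this example can be checked directly. The sketch below evaluates the difference quotient at the origin for the function defined above; the helper name `dF0` and the step `tau` are illustrative choices. Because $F$ is homogeneous of degree one, the quotient does not actually depend on `tau`, up to rounding.

```python
def F(x, y):
    # The example function from the text.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 / (x**2 + y**2)

def dF0(a, b, tau=1e-3):
    """Difference quotient (F(tau*a, tau*b) - F(0, 0)) / tau at the origin."""
    return (F(tau * a, tau * b) - F(0.0, 0.0)) / tau

e1 = dF0(1.0, 0.0)    # direction (1, 0): a^3 / (a^2 + b^2) = 1
e2 = dF0(0.0, 1.0)    # direction (0, 1): 0
both = dF0(1.0, 1.0)  # direction (1, 1): 1/2, not e1 + e2

print(e1, e2, both)
print(dF0(2.0, 2.0))  # homogeneity: twice the value in direction (1, 1)
```

The differential in the direction $(1,1)$ is about $1/2$, whereas the sum of the differentials in the directions $(1,0)$ and $(0,1)$ is about $1$, so the map $(a,b)\mapsto dF(0,0;a,b)$ is homogeneous but not additive.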

Relation with the Fréchet derivative

If F {\displaystyle F} is Fréchet differentiable, then it is also Gateaux differentiable, and its Fréchet and Gateaux derivatives agree. The converse is clearly not true, since the Gateaux derivative may fail to be linear or continuous. In fact, it is even possible for the Gateaux derivative to be linear and continuous but for the Fréchet derivative to fail to exist.
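The last possibility can be illustrated in two real variables. The function used below, $F(x,y)=x^{3}y/(x^{6}+y^{2})$ with $F(0,0)=0$, is a standard counterexample chosen here for illustration and does not come from the article. Its Gateaux differential at the origin is identically zero (hence linear and continuous), yet $F$ equals $1/2$ along the curve $y=x^{3}$, so it is not even continuous at the origin and cannot be Fréchet differentiable there.

```python
def F(x, y):
    # Illustrative counterexample (an assumption, not from the text).
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 * y / (x**6 + y**2)

def quotient(a, b, tau):
    """Difference quotient (F(tau*a, tau*b) - F(0, 0)) / tau at the origin."""
    return (F(tau * a, tau * b) - F(0.0, 0.0)) / tau

# Along any fixed direction the quotient shrinks with tau.
along_direction = [quotient(1.0, 1.0, tau) for tau in (1e-1, 1e-2, 1e-4)]

# Along the curve y = x^3 the function stays near 1/2 however close to 0.
on_curve = [F(t, t**3) for t in (1e-1, 1e-2, 1e-3)]

print(along_direction)
print(on_curve)
```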

Nevertheless, for functions $F$ from a complex Banach space $X$ to another complex Banach space $Y$, the Gateaux derivative (where the limit is taken over complex $\tau$ tending to zero as in the definition of complex differentiability) is automatically linear, a theorem of Zorn (1945). Furthermore, if $F$ is (complex) Gateaux differentiable at each $u\in U$ with derivative

$$DF(u)\colon\psi\mapsto dF(u;\psi)$$
then $F$ is Fréchet differentiable on $U$ with Fréchet derivative $DF$ (Zorn 1946). This is analogous to the result from basic complex analysis that a function is analytic if it is complex differentiable in an open set, and is a fundamental result in the study of infinite dimensional holomorphy.

Continuous differentiability

Continuous Gateaux differentiability may be defined in two inequivalent ways. Suppose that $F\colon U\to Y$ is Gateaux differentiable at each point of the open set $U$. One notion of continuous differentiability in $U$ requires that the mapping on the product space

$$dF\colon U\times X\to Y$$
be continuous. Linearity need not be assumed: if $X$ and $Y$ are Fréchet spaces, then $dF(u;\cdot)$ is automatically bounded and linear for all $u$ (Hamilton 1982).

A stronger notion of continuous differentiability requires that

$$u\mapsto DF(u)$$
be a continuous mapping
$$U\to L(X,Y)$$
from $U$ to the space of continuous linear functions from $X$ to $Y$. Note that this already presupposes the linearity of $DF(u)$.

As a matter of technical convenience, this latter notion of continuous differentiability is typical (but not universal) when the spaces $X$ and $Y$ are Banach, since $L(X,Y)$ is also Banach and standard results from functional analysis can then be employed. The former is the more common definition in areas of nonlinear analysis where the function spaces involved are not necessarily Banach spaces. For instance, differentiation in Fréchet spaces has applications such as the Nash–Moser inverse function theorem, in which the function spaces of interest often consist of smooth functions on a manifold.

Higher derivatives

Whereas higher order Fréchet derivatives are naturally defined as multilinear functions by iteration, using the isomorphisms $L^{n}(X,Y)=L(X,L^{n-1}(X,Y))$, higher order Gateaux derivatives cannot be defined in this way. Instead, the $n$th order Gateaux derivative of a function $F:U\subseteq X\to Y$ in the direction $h$ is defined by

$$d^{n}F(u;h)=\left.\frac{d^{n}}{d\tau^{n}}F(u+\tau h)\right|_{\tau=0}. \tag{2}$$

Rather than a multilinear function, this is instead a homogeneous function of degree $n$ in $h$.
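The degree-$n$ homogeneity can be observed numerically for $n=2$. The sketch below assumes $X=\mathbb{R}^{2}$, $Y=\mathbb{R}$ and an illustrative cubic polynomial `F` (not from the article), and estimates $d^{2}F(u;h)$ with a central second difference in $\tau$, which is exact up to rounding for cubic polynomials.

```python
def F(x, y):
    # Illustrative cubic polynomial (an assumption, not from the text).
    return x**2 * y + y**3

def d2(F, u, h, tau=1e-2):
    """Central second difference in tau, estimating d^2 F(u; h) from (2)."""
    up = F(u[0] + tau * h[0], u[1] + tau * h[1])
    mid = F(u[0], u[1])
    down = F(u[0] - tau * h[0], u[1] - tau * h[1])
    return (up - 2.0 * mid + down) / tau**2

u = (1.0, 2.0)
h = (1.0, -1.0)

single = d2(F, u, h)                      # d^2 F(u; h)
doubled = d2(F, u, (2 * h[0], 2 * h[1]))  # d^2 F(u; 2h)
print(single, doubled)
```

Doubling the direction multiplies the result by $2^{2}=4$ rather than by $2$, as expected of a function that is homogeneous of degree two in $h$.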

There is another candidate for the definition of the higher order derivative, the function

$$D^{2}F(u)\{h,k\}=\lim_{\tau\to 0}\frac{DF(u+\tau k)h-DF(u)h}{\tau}=\left.\frac{\partial^{2}}{\partial\tau\,\partial\sigma}F(u+\sigma h+\tau k)\right|_{\tau=\sigma=0} \tag{3}$$

that arises naturally in the calculus of variations as the second variation of $F$, at least in the special case where $F$ is scalar-valued. However, this may fail to have any reasonable properties at all, aside from being separately homogeneous in $h$ and $k$. It is desirable to have sufficient conditions in place to ensure that $D^{2}F(u)\{h,k\}$ is a symmetric bilinear function of $h$ and $k$, and that it agrees with the polarization of $d^{2}F$.

For instance, the following sufficient condition holds (Hamilton 1982). Suppose that $F$ is $C^{1}$ in the sense that the mapping

$$DF:U\times X\to Y$$
is continuous in the product topology, and moreover that the second derivative defined by (3) is also continuous in the sense that
$$D^{2}F:U\times X\times X\to Y$$
is continuous. Then $D^{2}F(u)\{h,k\}$ is bilinear and symmetric in $h$ and $k$. By virtue of the bilinearity, the polarization identity holds:
$$D^{2}F(u)\{h,k\}=\frac{1}{2}\left(d^{2}F(u;h+k)-d^{2}F(u;h)-d^{2}F(u;k)\right)$$
relating the second order derivative $D^{2}F(u)$ with the differential $d^{2}F(u;-)$. Similar conclusions hold for higher order derivatives.
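The polarization identity can be checked numerically. The sketch below assumes $X=\mathbb{R}^{2}$, $Y=\mathbb{R}$ and an illustrative cubic polynomial `F` (not from the article). It estimates $d^{2}F(u;h)$ from (2) by a central second difference and $D^{2}F(u)\{h,k\}$ from (3) by a mixed central difference; both are exact up to rounding for cubic polynomials.

```python
def F(x, y):
    # Illustrative cubic polynomial (an assumption, not from the text).
    return x**2 * y + y**3

def d2(u, h, tau=1e-2):
    """Estimate d^2 F(u; h) as in (2) by a central second difference."""
    up = F(u[0] + tau * h[0], u[1] + tau * h[1])
    down = F(u[0] - tau * h[0], u[1] - tau * h[1])
    return (up - 2.0 * F(u[0], u[1]) + down) / tau**2

def D2(u, h, k, s=1e-2):
    """Estimate D^2 F(u){h, k} as in (3) by a mixed central difference."""
    def G(a, b):
        # G(a, b) = F(u + a*h + b*k)
        return F(u[0] + a * h[0] + b * k[0], u[1] + a * h[1] + b * k[1])
    return (G(s, s) - G(s, -s) - G(-s, s) + G(-s, -s)) / (4.0 * s * s)

u = (1.0, 2.0)
h = (1.0, -1.0)
k = (2.0, 1.0)
h_plus_k = (h[0] + k[0], h[1] + k[1])

mixed = D2(u, h, k)
polarized = 0.5 * (d2(u, h_plus_k) - d2(u, h) - d2(u, k))
print(mixed, polarized, D2(u, k, h))
```

The three printed values should coincide up to rounding, reflecting both the polarization identity and the symmetry of $D^{2}F(u)$ in $h$ and $k$.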

Properties

A version of the fundamental theorem of calculus holds for the Gateaux derivative of $F$, provided $F$ is assumed to be sufficiently continuously differentiable. Specifically:

  • Suppose that $F:U\to Y$ is $C^{1}$ in the sense that the Gateaux derivative is a continuous function $dF:U\times X\to Y$. Then for any $u\in U$ and $h\in X$ such that the line segment from $u$ to $u+h$ lies in $U$,
    $$F(u+h)-F(u)=\int_{0}^{1}dF(u+th;h)\,dt$$
    where the integral is the Gelfand–Pettis integral (the weak integral) (Vainberg 1964).
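In finite dimensions the weak integral reduces to an ordinary integral, so the identity can be checked by quadrature. The following is a minimal sketch, assuming $X=\mathbb{R}^{2}$, $Y=\mathbb{R}$, an illustrative smooth `F` with its differential written out by hand, and a composite midpoint rule; none of these names come from the article.

```python
import math

def F(x, y):
    # Illustrative smooth function (an assumption, not from the text).
    return x * x * y + math.sin(y)

def dF(u, h):
    """Gateaux differential of F at u in direction h (gradient applied to h)."""
    x, y = u
    return 2.0 * x * y * h[0] + (x * x + math.cos(y)) * h[1]

def integral_01(g, n=1000):
    """Composite midpoint rule for the integral of g over [0, 1]."""
    return sum(g((i + 0.5) / n) for i in range(n)) / n

u = (1.0, 2.0)
h = (0.5, -1.5)

lhs = F(u[0] + h[0], u[1] + h[1]) - F(u[0], u[1])
rhs = integral_01(lambda t: dF((u[0] + t * h[0], u[1] + t * h[1]), h))
print(lhs, rhs)
```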

Many of the other familiar properties of the derivative follow from this, such as multilinearity and commutativity of the higher-order derivatives. Further properties, also consequences of the fundamental theorem, include:

  • (The chain rule)
    $$d(G\circ F)(u;x)=dG(F(u);dF(u;x))$$
    for all $u\in U$ and $x\in X$. (Importantly, as with simple partial derivatives, the Gateaux derivative does not satisfy the chain rule if the derivative is permitted to be discontinuous.)
  • (Taylor's theorem with remainder)
    Suppose that the line segment between $u\in U$ and $u+h$ lies entirely within $U$. If $F$ is $C^{k}$ then
    $$F(u+h)=F(u)+dF(u;h)+\frac{1}{2!}d^{2}F(u;h)+\dots+\frac{1}{(k-1)!}d^{k-1}F(u;h)+R_{k}$$
    where the remainder term is given by
    $$R_{k}(u;h)=\frac{1}{(k-1)!}\int_{0}^{1}(1-t)^{k-1}d^{k}F(u+th;h)\,dt$$
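The Taylor formula with integral remainder can be checked numerically for $k=2$, where it reads $F(u+h)=F(u)+dF(u;h)+R_{2}$ with $R_{2}=\int_{0}^{1}(1-t)\,d^{2}F(u+th;h)\,dt$. The sketch below assumes $X=\mathbb{R}^{2}$, $Y=\mathbb{R}$, an illustrative smooth `F` whose first and second differentials are written out by hand, and a midpoint quadrature rule; these are assumptions made for the example.

```python
import math

def F(x, y):
    # Illustrative smooth function (an assumption, not from the text).
    return x * x * y + math.sin(y)

def dF(u, h):
    """First Gateaux differential: gradient of F at u applied to h."""
    x, y = u
    return 2.0 * x * y * h[0] + (x * x + math.cos(y)) * h[1]

def d2F(u, h):
    """Second Gateaux differential: Hessian quadratic form of F at u in h."""
    x, y = u
    return 2.0 * y * h[0] ** 2 + 4.0 * x * h[0] * h[1] - math.sin(y) * h[1] ** 2

def integral_01(g, n=1000):
    """Composite midpoint rule for the integral of g over [0, 1]."""
    return sum(g((i + 0.5) / n) for i in range(n)) / n

u = (1.0, 2.0)
h = (0.5, -1.5)

# Remainder R_2 = integral over [0, 1] of (1 - t) * d^2 F(u + t h; h).
R2 = integral_01(
    lambda t: (1.0 - t) * d2F((u[0] + t * h[0], u[1] + t * h[1]), h)
)
lhs = F(u[0] + h[0], u[1] + h[1])
rhs = F(u[0], u[1]) + dF(u, h) + R2
print(lhs, rhs)
```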

Example

Let $X$ be the Hilbert space of square-integrable functions on a Lebesgue measurable set $\Omega$ in the Euclidean space $\mathbb{R}^{n}$. The functional

$$E:X\to\mathbb{R}$$
$$E(u)=\int_{\Omega}F(u(x))\,dx$$
where $F$ is a differentiable real-valued function of a real variable and $u$ is defined on $\Omega$ with real values, has Gateaux derivative
$$dE(u;\psi)=\langle F'(u),\psi\rangle:=\int_{\Omega}F'(u(x))\,\psi(x)\,dx.$$

Indeed, the above is the limit as $\tau\to 0$ of

$$\begin{aligned}\frac{E(u+\tau\psi)-E(u)}{\tau}&=\frac{1}{\tau}\left(\int_{\Omega}F(u+\tau\,\psi)\,dx-\int_{\Omega}F(u)\,dx\right)\\[6pt]&=\frac{1}{\tau}\left(\int_{\Omega}\int_{0}^{1}\frac{d}{ds}F(u+s\,\tau\,\psi)\,ds\,dx\right)\\[6pt]&=\int_{\Omega}\int_{0}^{1}F'(u+s\tau\psi)\,\psi\,ds\,dx.\end{aligned}$$
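A discretized version of this example can be computed directly. The sketch below assumes $\Omega=[0,1]$, the integrand $F(s)=\sin s$, a midpoint grid with `n` cells, and particular sample functions `u` and `psi`; all of these are illustrative choices rather than part of the article. It compares the difference quotient of the discretized functional with the discretized pairing $\langle F'(u),\psi\rangle$.

```python
import math

n = 400
xs = [(i + 0.5) / n for i in range(n)]  # midpoints of a uniform grid on [0, 1]

def F(s):
    # Illustrative integrand (an assumption, not from the text).
    return math.sin(s)

def F_prime(s):
    return math.cos(s)

def E(u):
    """Midpoint-rule approximation of the integral of F(u(x)) over [0, 1]."""
    return sum(F(ui) for ui in u) / n

u = [math.sin(2.0 * math.pi * x) for x in xs]  # sample point u in X
psi = [x * x for x in xs]                      # sample direction psi

tau = 1e-6
quotient = (E([a + tau * b for a, b in zip(u, psi)]) - E(u)) / tau

# Midpoint-rule approximation of the integral of F'(u(x)) * psi(x).
pairing = sum(F_prime(a) * b for a, b in zip(u, psi)) / n

print(quotient, pairing)
```

The two values should agree to several digits; the small discrepancy comes from using a fixed nonzero `tau` in place of the limit.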

References

  • Gateaux, René (1913), "Sur les fonctionnelles continues et les fonctionnelles analytiques", Comptes rendus hebdomadaires des séances de l'Académie des sciences, 157, Paris: 325–327.
  • Gateaux, René (1919), "Fonctions d'une infinité de variables indépendantes", Bulletin de la Société Mathématique de France, 47: 70–96, doi:10.24033/bsmf.995.
  • Hamilton, R. S. (1982), "The inverse function theorem of Nash and Moser", Bull. Amer. Math. Soc., 7 (1): 65–222, doi:10.1090/S0273-0979-1982-15004-2, MR 0656198.
  • Hille, Einar; Phillips, Ralph S. (1974), Functional analysis and semi-groups, Providence, R.I.: American Mathematical Society, MR 0423094.
  • Tikhomirov, V.M. (2001) [1994], "Gâteaux variation", Encyclopedia of Mathematics, EMS Press.
  • Vainberg, M.M. (1964), Variational Methods for the Study of Nonlinear Operators, San Francisco, London, Amsterdam: Holden-Day, Inc, p. 57.
  • Zorn, Max (1945), "Characterization of analytic functions in Banach spaces", Annals of Mathematics, Second Series, 46 (4): 585–593, doi:10.2307/1969198, ISSN 0003-486X, JSTOR 1969198, MR 0014190.
  • Zorn, Max (1946), "Derivatives and Frechet differentials", Bulletin of the American Mathematical Society, 52 (2): 133–137, doi:10.1090/S0002-9904-1946-08524-9, MR 0014595.