Hilbert projection theorem

On closed convex subsets in Hilbert space

In mathematics, the Hilbert projection theorem is a famous result of convex analysis that says that for every vector x {\displaystyle x} in a Hilbert space H {\displaystyle H} and every nonempty closed convex C H , {\displaystyle C\subseteq H,} there exists a unique vector m C {\displaystyle m\in C} for which c x {\displaystyle \|c-x\|} is minimized over the vectors c C {\displaystyle c\in C} ; that is, such that m x c x {\displaystyle \|m-x\|\leq \|c-x\|} for every c C . {\displaystyle c\in C.}

Finite dimensional case

Some intuition for the theorem can be obtained by considering the first order condition of the optimization problem.

Consider a finite dimensional real Hilbert space H {\displaystyle H} with a subspace C {\displaystyle C} and a point x . {\displaystyle x.} If m C {\displaystyle m\in C} is a minimizer or minimum point of the function N : C R {\displaystyle N:C\to \mathbb {R} } defined by N ( c ) := c x {\displaystyle N(c):=\|c-x\|} (which is the same as the minimum point of c c x 2 {\displaystyle c\mapsto \|c-x\|^{2}} ), then derivative must be zero at m . {\displaystyle m.}

In matrix derivative notation[1]

x c 2 = c x , c x = 2 c x , c {\displaystyle {\begin{aligned}\partial \lVert x-c\rVert ^{2}&=\partial \langle c-x,c-x\rangle \\&=2\langle c-x,\partial c\rangle \end{aligned}}}
Since c {\displaystyle \partial c} is a vector in C {\displaystyle C} that represents an arbitrary tangent direction, it follows that m x {\displaystyle m-x} must be orthogonal to every vector in C . {\displaystyle C.}

Statement

Hilbert projection theorem — For every vector x {\displaystyle x} in a Hilbert space H {\displaystyle H} and every nonempty closed convex C H , {\displaystyle C\subseteq H,} there exists a unique vector m C {\displaystyle m\in C} for which x m {\displaystyle \lVert x-m\rVert } is equal to δ := inf c C x c . {\displaystyle \delta :=\inf _{c\in C}\|x-c\|.}

If the closed subset C {\displaystyle C} is also a vector subspace of H {\displaystyle H} then this minimizer m {\displaystyle m} is the unique element in C {\displaystyle C} such that x m {\displaystyle x-m} is orthogonal to C . {\displaystyle C.}

Detailed elementary proof

Proof that a minimum point y {\displaystyle y} exists

Let δ := inf c C x c {\displaystyle \delta :=\inf _{c\in C}\|x-c\|} be the distance between x {\displaystyle x} and C , {\displaystyle C,} ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} a sequence in C {\displaystyle C} such that the distance squared between x {\displaystyle x} and c n {\displaystyle c_{n}} is less than or equal to δ 2 + 1 / n . {\displaystyle \delta ^{2}+1/n.} Let n {\displaystyle n} and m {\displaystyle m} be two integers, then the following equalities are true:

c n c m 2 = c n x 2 + c m x 2 2 c n x , c m x {\displaystyle \left\|c_{n}-c_{m}\right\|^{2}=\left\|c_{n}-x\right\|^{2}+\left\|c_{m}-x\right\|^{2}-2\left\langle c_{n}-x\,,\,c_{m}-x\right\rangle }
and
4 c n + c m 2 x 2 = c n x 2 + c m x 2 + 2 c n x , c m x {\displaystyle 4\left\|{\frac {c_{n}+c_{m}}{2}}-x\right\|^{2}=\left\|c_{n}-x\right\|^{2}+\left\|c_{m}-x\right\|^{2}+2\left\langle c_{n}-x\,,\,c_{m}-x\right\rangle }
Therefore
c n c m 2 = 2 c n x 2 + 2 c m x 2 4 c n + c m 2 x 2 {\displaystyle \left\|c_{n}-c_{m}\right\|^{2}=2\left\|c_{n}-x\right\|^{2}+2\left\|c_{m}-x\right\|^{2}-4\left\|{\frac {c_{n}+c_{m}}{2}}-x\right\|^{2}}
(This equation is the same as the formula a 2 = 2 b 2 + 2 c 2 4 M a 2 {\displaystyle a^{2}=2b^{2}+2c^{2}-4M_{a}^{2}} for the length M a {\displaystyle M_{a}} of a median in a triangle with sides of length a , b , {\displaystyle a,b,} and c , {\displaystyle c,} where specifically, the triangle's vertices are x , c m , c n {\displaystyle x,c_{m},c_{n}} ).

By giving an upper bound to the first two terms of the equality and by noticing that the middle of c n {\displaystyle c_{n}} and c m {\displaystyle c_{m}} belong to C {\displaystyle C} and has therefore a distance greater than or equal to δ {\displaystyle \delta } from x , {\displaystyle x,} it follows that:

c n c m 2 2 ( δ 2 + 1 n ) + 2 ( δ 2 + 1 m ) 4 δ 2 = 2 ( 1 n + 1 m ) {\displaystyle \|c_{n}-c_{m}\|^{2}\;\leq \;2\left(\delta ^{2}+{\frac {1}{n}}\right)+2\left(\delta ^{2}+{\frac {1}{m}}\right)-4\delta ^{2}=2\left({\frac {1}{n}}+{\frac {1}{m}}\right)}

The last inequality proves that ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} is a Cauchy sequence. Since C {\displaystyle C} is complete, the sequence is therefore convergent to a point m C , {\displaystyle m\in C,} whose distance from x {\displaystyle x} is minimal. {\displaystyle \blacksquare }

Proof that m {\displaystyle m} is unique

Let m 1 {\displaystyle m_{1}} and m 2 {\displaystyle m_{2}} be two minimum points. Then:

m 2 m 1 2 = 2 m 1 x 2 + 2 m 2 x 2 4 m 1 + m 2 2 x 2 {\displaystyle \|m_{2}-m_{1}\|^{2}=2\|m_{1}-x\|^{2}+2\|m_{2}-x\|^{2}-4\left\|{\frac {m_{1}+m_{2}}{2}}-x\right\|^{2}}

Since m 1 + m 2 2 {\displaystyle {\frac {m_{1}+m_{2}}{2}}} belongs to C , {\displaystyle C,} we have m 1 + m 2 2 x 2 δ 2 {\displaystyle \left\|{\frac {m_{1}+m_{2}}{2}}-x\right\|^{2}\geq \delta ^{2}} and therefore

m 2 m 1 2 2 δ 2 + 2 δ 2 4 δ 2 = 0. {\displaystyle \|m_{2}-m_{1}\|^{2}\leq 2\delta ^{2}+2\delta ^{2}-4\delta ^{2}=0.}

Hence m 1 = m 2 , {\displaystyle m_{1}=m_{2},} which proves uniqueness. {\displaystyle \blacksquare }

Proof of characterization of minimum point when C {\displaystyle C} is a closed vector subspace

Assume that C {\displaystyle C} is a closed vector subspace of H . {\displaystyle H.} It must be shown the minimizer m {\displaystyle m} is the unique element in C {\displaystyle C} such that m x , c = 0 {\displaystyle \langle m-x,c\rangle =0} for every c C . {\displaystyle c\in C.}

Proof that the condition is sufficient: Let z C {\displaystyle z\in C} be such that z x , c = 0 {\displaystyle \langle z-x,c\rangle =0} for all c C . {\displaystyle c\in C.} If c C {\displaystyle c\in C} then c z C {\displaystyle c-z\in C} and so

c x 2 = ( z x ) + ( c z ) 2 = z x 2 + c z 2 + 2 z x , c z = z x 2 + c z 2 {\displaystyle \|c-x\|^{2}=\|(z-x)+(c-z)\|^{2}=\|z-x\|^{2}+\|c-z\|^{2}+2\langle z-x,c-z\rangle =\|z-x\|^{2}+\|c-z\|^{2}}
which implies that z x 2 c x 2 . {\displaystyle \|z-x\|^{2}\leq \|c-x\|^{2}.} Because c C {\displaystyle c\in C} was arbitrary, this proves that z x = inf c C c x {\displaystyle \|z-x\|=\inf _{c\in C}\|c-x\|} and so z {\displaystyle z} is a minimum point.

Proof that the condition is necessary: Let m C {\displaystyle m\in C} be the minimum point. Let c C {\displaystyle c\in C} and t R . {\displaystyle t\in \mathbb {R} .} Because m + t c C , {\displaystyle m+tc\in C,} the minimality of m {\displaystyle m} guarantees that m x ( m + t c ) x . {\displaystyle \|m-x\|\leq \|(m+tc)-x\|.} Thus

( m + t c ) x 2 m x 2 = 2 t m x , c + t 2 c 2 {\displaystyle \|(m+tc)-x\|^{2}-\|m-x\|^{2}=2t\langle m-x,c\rangle +t^{2}\|c\|^{2}}
is always non-negative and m x , c {\displaystyle \langle m-x,c\rangle } must be a real number. If m x , c 0 {\displaystyle \langle m-x,c\rangle \neq 0} then the map f ( t ) := 2 t m x , c + t 2 c 2 {\displaystyle f(t):=2t\langle m-x,c\rangle +t^{2}\|c\|^{2}} has a minimum at t 0 := m x , c c 2 {\displaystyle t_{0}:=-{\frac {\langle m-x,c\rangle }{\|c\|^{2}}}} and moreover, f ( t 0 ) < 0 , {\displaystyle f\left(t_{0}\right)<0,} which is a contradiction. Thus m x , c = 0. {\displaystyle \langle m-x,c\rangle =0.} {\displaystyle \blacksquare }

Proof by reduction to a special case

It suffices to prove the theorem in the case of x = 0 {\displaystyle x=0} because the general case follows from the statement below by replacing C {\displaystyle C} with C x . {\displaystyle C-x.}

Hilbert projection theorem (case x = 0 {\displaystyle x=0} )[2] — For every nonempty closed convex subset C H {\displaystyle C\subseteq H} of a Hilbert space H , {\displaystyle H,} there exists a unique vector m C {\displaystyle m\in C} such that inf c C c = m . {\displaystyle \inf _{c\in C}\|c\|=\|m\|.}

Furthermore, letting d := inf c C c , {\displaystyle d:=\inf _{c\in C}\|c\|,} if ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} is any sequence in C {\displaystyle C} such that lim n c n = d {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=d} in R {\displaystyle \mathbb {R} } [note 1] then lim n c n = m {\displaystyle \lim _{n\to \infty }c_{n}=m} in H . {\displaystyle H.}

Proof

Let C {\displaystyle C} be as described in this theorem and let

d := inf c C c . {\displaystyle d:=\inf _{c\in C}\|c\|.}
This theorem will follow from the following lemmas.

Lemma 1 — If c := ( c n ) n = 1 {\displaystyle c_{\bullet }:=\left(c_{n}\right)_{n=1}^{\infty }} is any sequence in C {\displaystyle C} such that lim n c n = d {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=d} in R {\displaystyle \mathbb {R} } then there exists some c C {\displaystyle c\in C} such that lim n c n = c {\displaystyle \lim _{n\to \infty }c_{n}=c} in H . {\displaystyle H.} Furthermore, c = d . {\displaystyle \|c\|=d.}

Proof of Lemma 1
Vectors involved in the parallelogram law: x + y 2 + x y 2 = 2 x 2 + 2 y 2 . {\displaystyle \|x+y\|^{2}+\|x-y\|^{2}=2\|x\|^{2}+2\|y\|^{2}.}

Because C {\displaystyle C} is convex, if m , n N {\displaystyle m,n\in \mathbb {N} } then 1 2 ( c m + c n ) C {\displaystyle {\frac {1}{2}}\left(c_{m}+c_{n}\right)\in C} so that by definition of the infimum, d 1 2 ( c m + c n ) , {\displaystyle d\leq \left\|{\frac {1}{2}}\left(c_{m}+c_{n}\right)\right\|,} which implies that 4 d 2 c m + c n 2 . {\displaystyle 4d^{2}\leq \left\|c_{m}+c_{n}\right\|^{2}.} By the parallelogram law,

c m + c n 2 + c m c n 2 = 2 c m 2 + 2 c n 2 {\displaystyle \left\|c_{m}+c_{n}\right\|^{2}+\left\|c_{m}-c_{n}\right\|^{2}=2\left\|c_{m}\right\|^{2}+2\left\|c_{n}\right\|^{2}}
where 4 d 2 c m + c n 2 {\displaystyle 4d^{2}\leq \left\|c_{m}+c_{n}\right\|^{2}} now implies
4 d 2 + c m c n 2     2 c m 2 + 2 c n 2 {\displaystyle 4d^{2}+\left\|c_{m}-c_{n}\right\|^{2}~\leq ~2\left\|c_{m}\right\|^{2}+2\left\|c_{n}\right\|^{2}}
and so
c m c n 2     2 c m 2 + 2 c n 2 4 d 2 {\displaystyle {\begin{alignedat}{4}\left\|c_{m}-c_{n}\right\|^{2}~\leq ~2\left\|c_{m}\right\|^{2}+2\left\|c_{n}\right\|^{2}-4d^{2}\end{alignedat}}}
The assumption lim n c n = d {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=d} implies that the right hand side (RHS) of the above inequality can be made arbitrary close to 0 {\displaystyle 0} by making m {\displaystyle m} and n {\displaystyle n} sufficiently large.[note 2] The same must consequently also be true of the inequality's left hand side c m c n 2 {\displaystyle \left\|c_{m}-c_{n}\right\|^{2}} and thus also of c m c n , {\displaystyle \left\|c_{m}-c_{n}\right\|,} which proves that ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} is a Cauchy sequence in H . {\displaystyle H.}

Since H {\displaystyle H} is complete, there exists some c H {\displaystyle c\in H} such that lim n c n = c {\displaystyle \lim _{n\to \infty }c_{n}=c} in H . {\displaystyle H.} Because every c n {\displaystyle c_{n}} belongs to C , {\displaystyle C,} which is a closed subset of H , {\displaystyle H,} their limit c {\displaystyle c} must also belongs to this closed subset, which proves that c C . {\displaystyle c\in C.} Since the norm : H R {\displaystyle \|\,\cdot \,\|:H\to \mathbb {R} } is a continuous function, lim n c n = c {\displaystyle \lim _{n\to \infty }c_{n}=c} in H {\displaystyle H} implies that lim n c n = c {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=\|c\|} in R . {\displaystyle \mathbb {R} .} But lim n c n = d {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=d} also holds (by assumption) so that c = d {\displaystyle \|c\|=d} (because limits in R {\displaystyle \mathbb {R} } are unique). {\displaystyle \blacksquare }

Lemma 2 — A sequence ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} satisfying the hypotheses of Lemma 1 exists.

Proof of Lemma 2

The existence of the sequence follows from the definition of the infimum, as is now shown. The set S := { c : c C } {\displaystyle S:=\{\|c\|:c\in C\}} is a non-empty subset of non-negative real numbers and d := inf c C c = inf S . {\displaystyle d:=\inf _{c\in C}\|c\|=\inf S.} Let n 1 {\displaystyle n\geq 1} be an integer. Because inf S < d + 1 n , {\displaystyle \inf S<d+{\frac {1}{n}},} there exists some s n S {\displaystyle s_{n}\in S} such that s n < d + 1 n . {\displaystyle s_{n}<d+{\frac {1}{n}}.} Since s n S , {\displaystyle s_{n}\in S,} d = inf S s n {\displaystyle d=\inf S\leq s_{n}} holds (by definition of the infimum). Thus d s n < d + 1 n {\displaystyle d\leq s_{n}<d+{\frac {1}{n}}} and now the squeeze theorem implies that lim n s n = d {\displaystyle \lim _{n\to \infty }s_{n}=d} in R . {\displaystyle \mathbb {R} .} (This first part of the proof works for any non-empty subset of S R {\displaystyle S\subseteq \mathbb {R} } for which d := inf s S s {\displaystyle d:=\inf _{s\in S}s} is finite).

For every n N , {\displaystyle n\in \mathbb {N} ,} the fact that s n S = { c : c C } {\displaystyle s_{n}\in S=\{\|c\|:c\in C\}} means that there exists some c n C {\displaystyle c_{n}\in C} such that s n = c n . {\displaystyle s_{n}=\left\|c_{n}\right\|.} The convergence lim n s n = d {\displaystyle \lim _{n\to \infty }s_{n}=d} in R {\displaystyle \mathbb {R} } thus becomes lim n c n = d {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=d} in R . {\displaystyle \mathbb {R} .} {\displaystyle \blacksquare }

Lemma 2 and Lemma 1 together prove that there exists some c C {\displaystyle c\in C} such that c = d . {\displaystyle \|c\|=d.} Lemma 1 can be used to prove uniqueness as follows. Suppose b C {\displaystyle b\in C} is such that b = d {\displaystyle \|b\|=d} and denote the sequence

b , c , b , c , b , c , {\displaystyle b,c,b,c,b,c,\ldots }
by ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} so that the subsequence ( c 2 n ) n = 1 {\displaystyle \left(c_{2n}\right)_{n=1}^{\infty }} of even indices is the constant sequence c , c , c , {\displaystyle c,c,c,\ldots } while the subsequence ( c 2 n 1 ) n = 1 {\displaystyle \left(c_{2n-1}\right)_{n=1}^{\infty }} of odd indices is the constant sequence b , b , b , . {\displaystyle b,b,b,\ldots .} Because c n = d {\displaystyle \left\|c_{n}\right\|=d} for every n N , {\displaystyle n\in \mathbb {N} ,} lim n c n = lim n d = d {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=\lim _{n\to \infty }d=d} in R , {\displaystyle \mathbb {R} ,} which shows that the sequence ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} satisfies the hypotheses of Lemma 1. Lemma 1 guarantees the existence of some x C {\displaystyle x\in C} such that lim n c n = x {\displaystyle \lim _{n\to \infty }c_{n}=x} in H . {\displaystyle H.} Because ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} converges to x , {\displaystyle x,} so do all of its subsequences. In particular, the subsequence c , c , c , {\displaystyle c,c,c,\ldots } converges to x , {\displaystyle x,} which implies that x = c {\displaystyle x=c} (because limits in H {\displaystyle H} are unique and this constant subsequence also converges to c {\displaystyle c} ). Similarly, x = b {\displaystyle x=b} because the subsequence b , b , b , {\displaystyle b,b,b,\ldots } converges to both x {\displaystyle x} and b . {\displaystyle b.} Thus b = c , {\displaystyle b=c,} which proves the theorem. {\displaystyle \blacksquare }

Consequences

Proposition — If C {\displaystyle C} is a closed vector subspace of a Hilbert space H {\displaystyle H} then[note 3]

H = C C . {\displaystyle H=C\oplus C^{\bot }.}

Proof[3]

Proof that C C = { 0 } {\displaystyle C\cap C^{\bot }=\{0\}} :

If c C C {\displaystyle c\in C\cap C^{\bot }} then 0 = c , c = c 2 , {\displaystyle 0=\langle \,c,\,c\,\rangle =\|c\|^{2},} which implies c = 0. {\displaystyle c=0.} {\displaystyle \blacksquare }


Proof that C {\displaystyle C^{\bot }} is a closed vector subspace of H {\displaystyle H} :

Let P := c C F {\displaystyle P:=\prod _{c\in C}\mathbb {F} } where F {\displaystyle \mathbb {F} } is the underlying scalar field of H {\displaystyle H} and define

L : H P h ( h , c ) c C {\displaystyle {\begin{alignedat}{4}L:\,&H&&\to \,&&P\\&h&&\mapsto \,&&\left(\langle \,h,\,c\,\rangle \right)_{c\in C}\\\end{alignedat}}}
which is continuous and linear because this is true of each of its coordinates h h , c . {\displaystyle h\mapsto \langle h,c\rangle .} The set C = L 1 ( 0 ) = L 1 ( { 0 } ) {\displaystyle C^{\bot }=L^{-1}(0)=L^{-1}\left(\{0\}\right)} is closed in H {\displaystyle H} because { 0 } {\displaystyle \{0\}} is closed in P {\displaystyle P} and L : H P {\displaystyle L:H\to P} is continuous. The kernel of any linear map is a vector subspace of its domain, which is why C = ker L {\displaystyle C^{\bot }=\ker L} is a vector subspace of H . {\displaystyle H.} {\displaystyle \blacksquare }


Proof that C + C = H {\displaystyle C+C^{\bot }=H} :

Let x H . {\displaystyle x\in H.} The Hilbert projection theorem guarantees the existence of a unique m C {\displaystyle m\in C} such that x m x c  for all  c C {\displaystyle \|x-m\|\leq \|x-c\|{\text{ for all }}c\in C} (or equivalently, for all x c x C {\displaystyle x-c\in x-C} ). Let p := x m {\displaystyle p:=x-m} so that x = m + p C + p {\displaystyle x=m+p\in C+p} and it remains to show that p C . {\displaystyle p\in C^{\bot }.} The inequality above can be rewritten as:

p z  for all  z x C . {\displaystyle \|p\|\leq \|z\|\quad {\text{ for all }}z\in x-C.}
Because m C {\displaystyle m\in C} and C {\displaystyle C} is a vector space, m + C = C {\displaystyle m+C=C} and C = C , {\displaystyle C=-C,} which implies that x C = x + C = p + m + C = p + C . {\displaystyle x-C=x+C=p+m+C=p+C.} The previous inequality thus becomes
p z  for all  z p + C . {\displaystyle \|p\|\leq \|z\|\quad {\text{ for all }}z\in p+C.}
or equivalently,
p p + c  for all  c C . {\displaystyle \|p\|\leq \|p+c\|\quad {\text{ for all }}c\in C.}
But this last statement is true if and only if p , c = 0 {\displaystyle \langle \,p,c\,\rangle =0} every c C . {\displaystyle c\in C.} Thus p C . {\displaystyle p\in C^{\bot }.} {\displaystyle \blacksquare }

Properties

Expression as a global minimum

The statement and conclusion of the Hilbert projection theorem can be expressed in terms of global minimums of the followings functions. Their notation will also be used to simplify certain statements.

Given a non-empty subset C H {\displaystyle C\subseteq H} and some x H , {\displaystyle x\in H,} define a function

d C , x : C [ 0 , )  by  c x c . {\displaystyle d_{C,x}:C\to [0,\infty )\quad {\text{ by }}c\mapsto \|x-c\|.}
A global minimum point of d C , x , {\displaystyle d_{C,x},} if one exists, is any point m {\displaystyle m} in domain d C , x = C {\displaystyle \,\operatorname {domain} d_{C,x}=C\,} such that
d C , x ( m ) d C , x ( c )  for all  c C , {\displaystyle d_{C,x}(m)\,\leq \,d_{C,x}(c)\quad {\text{ for all }}c\in C,}
in which case d C , x ( m ) = m x {\displaystyle d_{C,x}(m)=\|m-x\|} is equal to the global minimum value of the function d C , x , {\displaystyle d_{C,x},} which is:
inf c C d C , x ( c ) = inf c C x c . {\displaystyle \inf _{c\in C}d_{C,x}(c)=\inf _{c\in C}\|x-c\|.}

Effects of translations and scalings

When this global minimum point m {\displaystyle m} exists and is unique then denote it by min ( C , x ) ; {\displaystyle \min(C,x);} explicitly, the defining properties of min ( C , x ) {\displaystyle \min(C,x)} (if it exists) are:

min ( C , x ) C  and  x min ( C , x ) x c  for all  c C . {\displaystyle \min(C,x)\in C\quad {\text{ and }}\quad \left\|x-\min(C,x)\right\|\leq \|x-c\|\quad {\text{ for all }}c\in C.}
The Hilbert projection theorem guarantees that this unique minimum point exists whenever C {\displaystyle C} is a non-empty closed and convex subset of a Hilbert space. However, such a minimum point can also exist in non-convex or non-closed subsets as well; for instance, just as long is C {\displaystyle C} is non-empty, if x C {\displaystyle x\in C} then min ( C , x ) = x . {\displaystyle \min(C,x)=x.}

If C H {\displaystyle C\subseteq H} is a non-empty subset, s {\displaystyle s} is any scalar, and x , x 0 H {\displaystyle x,x_{0}\in H} are any vectors then

min ( s C + x 0 , s x + x 0 ) = s min ( C , x ) + x 0 {\displaystyle \,\min \left(sC+x_{0},sx+x_{0}\right)=s\min(C,x)+x_{0}}
which implies:
min ( s C , s x ) = s min ( C , x ) min ( C , x ) = min ( C , x ) {\displaystyle {\begin{alignedat}{6}\min &(sC,sx)&&=s&&\min(C,x)\\\min &(-C,-x)&&=-&&\min(C,x)\\\end{alignedat}}}
min ( C + x 0 , x + x 0 ) = min ( C , x ) + x 0 min ( C x 0 , x x 0 ) = min ( C , x ) x 0 {\displaystyle {\begin{alignedat}{6}\min \left(C+x_{0},x+x_{0}\right)&=\min(C,x)+x_{0}\\\min \left(C-x_{0},x-x_{0}\right)&=\min(C,x)-x_{0}\\\end{alignedat}}}
min ( C , x ) = min ( C + x , 0 ) x min ( C , 0 ) + x = min ( C + x , x ) min ( C x , 0 ) = min ( C , x ) x {\displaystyle {\begin{alignedat}{6}\min &(C,-x){}&&=\min(C+x,0)-x\\\min &(C,0)\;+\;x\;\;\;\;&&=\min(C+x,x)\\\min &(C-x,0){}&&=\min(C,x)-x\\\end{alignedat}}}

Examples

The following counter-example demonstrates a continuous linear isomorphism A : H H {\displaystyle A:H\to H} for which min ( A ( C ) , A ( x ) ) A ( min ( C , x ) ) . {\displaystyle \,\min(A(C),A(x))\neq A(\min(C,x)).} Endow H := R 2 {\displaystyle H:=\mathbb {R} ^{2}} with the dot product, let x 0 := ( 0 , 1 ) , {\displaystyle x_{0}:=(0,1),} and for every real s R , {\displaystyle s\in \mathbb {R} ,} let L s := { ( x , s x ) : x R } {\displaystyle L_{s}:=\{(x,sx):x\in \mathbb {R} \}} be the line of slope s {\displaystyle s} through the origin, where it is readily verified that min ( L s , x 0 ) = s 1 + s 2 ( 1 , s ) . {\displaystyle \min \left(L_{s},x_{0}\right)={\frac {s}{1+s^{2}}}(1,s).} Pick a real number r 0 {\displaystyle r\neq 0} and define A : R 2 R 2 {\displaystyle A:\mathbb {R} ^{2}\to \mathbb {R} ^{2}} by A ( x , y ) := ( r x , y ) {\displaystyle A(x,y):=(rx,y)} (so this map scales the x {\displaystyle x-} coordinate by r {\displaystyle r} while leaving the y {\displaystyle y-} coordinate unchanged). Then A : R 2 R 2 {\displaystyle A:\mathbb {R} ^{2}\to \mathbb {R} ^{2}} is an invertible continuous linear operator that satisfies A ( L s ) = L s / r {\displaystyle A\left(L_{s}\right)=L_{s/r}} and A ( x 0 ) = x 0 , {\displaystyle A\left(x_{0}\right)=x_{0},} so that min ( A ( L s ) , A ( x 0 ) ) = s r 2 + s 2 ( 1 , s ) {\displaystyle \,\min \left(A\left(L_{s}\right),A\left(x_{0}\right)\right)={\frac {s}{r^{2}+s^{2}}}(1,s)} and A ( min ( L s , x 0 ) ) = s 1 + s 2 ( r , s ) . {\displaystyle A\left(\min \left(L_{s},x_{0}\right)\right)={\frac {s}{1+s^{2}}}\left(r,s\right).} Consequently, if C := L s {\displaystyle C:=L_{s}} with s 0 {\displaystyle s\neq 0} and if ( r , s ) ( ± 1 , 1 ) {\displaystyle (r,s)\neq (\pm 1,1)} then min ( A ( C ) , A ( x 0 ) ) A ( min ( C , x 0 ) ) . {\displaystyle \,\min(A(C),A\left(x_{0}\right))\neq A\left(\min \left(C,x_{0}\right)\right).}

See also

  • Orthogonal complement – Concept in linear algebra
  • Orthogonal projection – Idempotent linear transformation from a vector space to itselfPages displaying short descriptions of redirect targets
  • Orthogonality principle – Condition for optimality of Bayesian estimator
  • Riesz representation theorem – Theorem about the dual of a Hilbert space

Notes

  1. ^ Because the norm : H R {\displaystyle \|\cdot \|:H\to \mathbb {R} } is continuous, if lim n x n {\displaystyle \lim _{n\to \infty }x_{n}} converges in H {\displaystyle H} then necessarily lim n x n {\displaystyle \lim _{n\to \infty }\left\|x_{n}\right\|} converges in R . {\displaystyle \mathbb {R} .} But in general, the converse is not guaranteed. However, under this theorem's hypotheses, knowing that lim n c n = d {\displaystyle \lim _{n\to \infty }\left\|c_{n}\right\|=d} in R {\displaystyle \mathbb {R} } is sufficient to conclude that lim n c n {\displaystyle \lim _{n\to \infty }c_{n}} converges in H . {\displaystyle H.}
  2. ^ Explicitly, this means that given any ϵ > 0 {\displaystyle \epsilon >0} there exists some integer N > 0 {\displaystyle N>0} such that "the quantity" is ϵ {\displaystyle \,\leq \epsilon } whenever m , n N . {\displaystyle m,n\geq N.} Here, "the quantity" refers to the inequality's right hand side 2 c m 2 + 2 c n 2 4 d 2 {\displaystyle 2\left\|c_{m}\right\|^{2}+2\left\|c_{n}\right\|^{2}-4d^{2}} and later in the proof, "the quantity" will also refer to c m c n 2 {\displaystyle \left\|c_{m}-c_{n}\right\|^{2}} and then c m c n . {\displaystyle \left\|c_{m}-c_{n}\right\|.} By definition of "Cauchy sequence," ( c n ) n = 1 {\displaystyle \left(c_{n}\right)_{n=1}^{\infty }} is Cauchy in H {\displaystyle H} if and only if "the quantity" c m c n {\displaystyle \left\|c_{m}-c_{n}\right\|} satisfies this aforementioned condition.
  3. ^ Technically, H = K K {\displaystyle H=K\oplus K^{\bot }} means that the addition map K × K H {\displaystyle K\times K^{\bot }\to H} defined by ( k , p ) k + p {\displaystyle (k,p)\mapsto k+p} is a surjective linear isomorphism and homeomorphism. See the article on complemented subspaces for more details.

References

  1. ^ Petersen, Kaare. "The Matrix Cookbook" (PDF). Retrieved 9 January 2021.
  2. ^ Rudin 1991, pp. 306–309.
  3. ^ Rudin 1991, pp. 307−309.

Bibliography

  • v
  • t
  • e
Spaces
Properties
TheoremsOperatorsAlgebrasOpen problemsApplicationsAdvanced topics
  • Category