Idempotent matrix

Matrix that, squared, equals itself

In linear algebra, an idempotent matrix is a matrix which, when multiplied by itself, yields itself.[1][2] That is, the matrix $A$ is idempotent if and only if $A^2 = A$. For this product $A^2$ to be defined, $A$ must necessarily be a square matrix. Viewed this way, idempotent matrices are idempotent elements of matrix rings.

Examples

Examples of $2 \times 2$ idempotent matrices are:

$$\begin{bmatrix}1&0\\0&1\end{bmatrix}\qquad \begin{bmatrix}3&-6\\1&-2\end{bmatrix}$$

Examples of $3 \times 3$ idempotent matrices are:

$$\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}\qquad \begin{bmatrix}2&-2&-4\\-1&3&4\\1&-2&-3\end{bmatrix}$$
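Idempotency is straightforward to verify numerically. Below is a minimal sketch using NumPy (an assumed tooling choice, not part of the original text) that checks the two non-identity examples above:

```python
import numpy as np

# The non-identity examples from above
A2 = np.array([[3, -6],
               [1, -2]])
A3 = np.array([[ 2, -2, -4],
               [-1,  3,  4],
               [ 1, -2, -3]])

for A in (A2, A3):
    # A is idempotent exactly when A @ A equals A
    assert np.array_equal(A @ A, A)
```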

Real $2 \times 2$ case

If a matrix $\begin{pmatrix}a&b\\c&d\end{pmatrix}$ is idempotent, then

  • $a = a^2 + bc,$
  • $b = ab + bd,$ implying $b(1 - a - d) = 0,$ so $b = 0$ or $d = 1 - a,$
  • $c = ca + cd,$ implying $c(1 - a - d) = 0,$ so $c = 0$ or $d = 1 - a,$
  • $d = bc + d^2.$

Thus, a necessary condition for a $2 \times 2$ matrix to be idempotent is that either it is diagonal or its trace equals 1. For idempotent diagonal matrices, $a$ and $d$ must each be either 1 or 0.

If $b = c$, the matrix $\begin{pmatrix}a&b\\b&1-a\end{pmatrix}$ will be idempotent provided $a^2 + b^2 = a,$ so $a$ satisfies the quadratic equation

$$a^2 - a + b^2 = 0, \quad \text{or} \quad \left(a - \frac{1}{2}\right)^2 + b^2 = \frac{1}{4},$$

which describes a circle with center (1/2, 0) and radius 1/2. In terms of an angle $\theta$,

$$A = \frac{1}{2}\begin{pmatrix}1 - \cos\theta & \sin\theta \\ \sin\theta & 1 + \cos\theta\end{pmatrix}$$ is idempotent.

However, $b = c$ is not a necessary condition: any matrix

$$\begin{pmatrix}a & b \\ c & 1-a\end{pmatrix}$$ with $a^2 + bc = a$ is idempotent.
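Both parametrizations can be checked numerically; a brief sketch, where the angle θ and the entries a, b are arbitrary values chosen for illustration:

```python
import numpy as np

def is_idempotent(A):
    """Check numerically that A @ A == A."""
    return np.allclose(A @ A, A)

# Angle parametrization with b = c
theta = 0.7  # arbitrary angle
A = 0.5 * np.array([[1 - np.cos(theta), np.sin(theta)],
                    [np.sin(theta),     1 + np.cos(theta)]])
assert is_idempotent(A)

# General form [[a, b], [c, 1 - a]] with a^2 + b*c == a
a, b = 0.3, 2.0           # arbitrary, b nonzero
c = (a - a**2) / b        # chosen so that a^2 + b*c = a
assert is_idempotent(np.array([[a, b], [c, 1 - a]]))
```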

Properties

Singularity and regularity

The only non-singular idempotent matrix is the identity matrix; that is, if a non-identity matrix is idempotent, its number of independent rows (and columns) is less than its number of rows (and columns).

This can be seen from writing $A^2 = A$, assuming that $A$ has full rank (is non-singular), and pre-multiplying by $A^{-1}$ to obtain $A = IA = A^{-1}A^2 = A^{-1}A = I$.

When an idempotent matrix is subtracted from the identity matrix, the result is also idempotent. This holds since

$$(I - A)(I - A) = I - A - A + A^2 = I - A - A + A = I - A.$$

If a matrix $A$ is idempotent, then $A^n = A$ for all positive integers $n$. This can be shown by induction. The base case $n = 1$ is immediate, since $A^1 = A$. Now suppose $A^{k-1} = A$ for some $k \geq 2$. Then $A^k = A^{k-1}A = AA = A$, since $A$ is idempotent. By the principle of induction, the result follows.
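Both properties are easy to confirm numerically; a small sketch reusing the $3 \times 3$ example from above:

```python
import numpy as np

A = np.array([[ 2, -2, -4],
              [-1,  3,  4],
              [ 1, -2, -3]])
I = np.eye(3, dtype=int)

# I - A is idempotent whenever A is
assert np.array_equal((I - A) @ (I - A), I - A)

# A^n = A for every positive integer n (spot-checked for n = 1..5)
for n in range(1, 6):
    assert np.array_equal(np.linalg.matrix_power(A, n), A)
```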

Eigenvalues

An idempotent matrix is always diagonalizable.[3] Its eigenvalues are either 0 or 1: if $\mathbf{x}$ is a non-zero eigenvector of some idempotent matrix $A$ and $\lambda$ its associated eigenvalue, then $\lambda\mathbf{x} = A\mathbf{x} = A^2\mathbf{x} = A\lambda\mathbf{x} = \lambda A\mathbf{x} = \lambda^2\mathbf{x},$ which implies $\lambda \in \{0, 1\}$. This further implies that the determinant of an idempotent matrix is always 0 or 1. As stated above, if the determinant is equal to one, the matrix is invertible and is therefore the identity matrix.
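A quick numerical illustration with the same $3 \times 3$ example:

```python
import numpy as np

A = np.array([[ 2, -2, -4],
              [-1,  3,  4],
              [ 1, -2, -3]])

# Every eigenvalue of an idempotent matrix is 0 or 1
for lam in np.linalg.eigvals(A):
    assert np.isclose(lam, 0) or np.isclose(lam, 1)

# The determinant, being the product of the eigenvalues, is 0 or 1;
# here it is 0 because A is singular (A is not the identity)
assert np.isclose(np.linalg.det(A), 0)
```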

Trace

The trace of an idempotent matrix (the sum of the elements on its main diagonal) equals the rank of the matrix and is therefore always an integer. This provides an easy way of computing the rank or, alternatively, of determining the trace of a matrix whose elements are not specifically known (which is helpful in statistics, for example, in establishing the degree of bias in using a sample variance as an estimate of a population variance).
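For instance, the $3 \times 3$ example above has trace 2 + 3 + (−3) = 2, matching its rank; a one-line check:

```python
import numpy as np

A = np.array([[ 2, -2, -4],
              [-1,  3,  4],
              [ 1, -2, -3]])

# trace equals rank for an idempotent matrix
assert np.trace(A) == np.linalg.matrix_rank(A) == 2
```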

Relationships between idempotent matrices

In regression analysis, the matrix $M = I - X(X'X)^{-1}X'$ is known to produce the residuals $e$ from the regression of the vector of dependent variables $y$ on the matrix of covariates $X$. (See the section on Applications.) Now, let $X_1$ be a matrix formed from a subset of the columns of $X$, and let $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$. It is easy to show that both $M$ and $M_1$ are idempotent, but a somewhat surprising fact is that $MM_1 = M$. This is because $MX_1 = 0$: the residuals from the regression of the columns of $X_1$ on $X$ are 0, since $X_1$ is a subset of the columns of $X$ and can therefore be fit perfectly (by direct substitution it is also straightforward to show that $MX = 0$). This leads to two other important results: first, $(M_1 - M)$ is symmetric and idempotent; second, $(M_1 - M)M = 0$, i.e., $(M_1 - M)$ is orthogonal to $M$. These results play a key role, for example, in the derivation of the F test.
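These identities can be illustrated numerically; a sketch in which the design matrix X is random (an arbitrary choice for demonstration) and X1 is taken as its first two columns:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))   # arbitrary full design matrix
X1 = X[:, :2]                      # subset of its columns

def residual_maker(Z):
    """M = I - Z (Z'Z)^{-1} Z', the annihilator of the columns of Z."""
    n = Z.shape[0]
    return np.eye(n) - Z @ np.linalg.inv(Z.T @ Z) @ Z.T

M, M1 = residual_maker(X), residual_maker(X1)

assert np.allclose(M @ M, M) and np.allclose(M1 @ M1, M1)  # both idempotent
assert np.allclose(M @ X1, 0)      # M annihilates the columns of X1
assert np.allclose(M @ M1, M)      # hence M M1 = M
D = M1 - M
assert np.allclose(D @ D, D)       # M1 - M is idempotent
assert np.allclose(D @ M, 0)       # and orthogonal to M
```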

Any matrix similar to an idempotent matrix is itself idempotent; that is, idempotency is preserved under a change of basis. This can be shown by squaring the transformed matrix $SAS^{-1}$, with $A$ idempotent: $(SAS^{-1})^2 = (SAS^{-1})(SAS^{-1}) = SA(S^{-1}S)AS^{-1} = SA^2S^{-1} = SAS^{-1}$.
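A brief numerical check, using the $2 \times 2$ example from above and an arbitrary invertible matrix S:

```python
import numpy as np

A = np.array([[3., -6.],
              [1., -2.]])        # idempotent (see Examples)
S = np.array([[1., 2.],
              [0., 1.]])         # arbitrary invertible change of basis
B = S @ A @ np.linalg.inv(S)

assert np.allclose(B @ B, B)     # similarity preserves idempotency
```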

Applications

Idempotent matrices arise frequently in regression analysis and econometrics. For example, in ordinary least squares, the regression problem is to choose a vector $\beta$ of coefficient estimates so as to minimize the sum of squared residuals (mispredictions) $e_i$: in matrix form,

Minimize $(y - X\beta)^\mathsf{T}(y - X\beta)$

where $y$ is a vector of dependent variable observations, and $X$ is a matrix each of whose columns is a column of observations on one of the independent variables. The resulting estimator is

$$\hat{\beta} = \left(X^\mathsf{T}X\right)^{-1}X^\mathsf{T}y$$

where superscript T indicates a transpose, and the vector of residuals is[2]

$$\hat{e} = y - X\hat{\beta} = y - X\left(X^\mathsf{T}X\right)^{-1}X^\mathsf{T}y = \left[I - X\left(X^\mathsf{T}X\right)^{-1}X^\mathsf{T}\right]y = My.$$

Here both $M$ and $X\left(X^\mathsf{T}X\right)^{-1}X^\mathsf{T}$ (the latter being known as the hat matrix) are idempotent and symmetric matrices, a fact which allows simplification when the sum of squared residuals is computed:

$$\hat{e}^\mathsf{T}\hat{e} = (My)^\mathsf{T}(My) = y^\mathsf{T}M^\mathsf{T}My = y^\mathsf{T}MMy = y^\mathsf{T}My.$$

The idempotency of $M$ plays a role in other calculations as well, such as in determining the variance of the estimator $\hat{\beta}$.
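The whole computation can be replayed on synthetic data; a sketch in which the true coefficients and the noise are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 3
X = rng.standard_normal((n, k))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(n)  # synthetic data

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix
M = np.eye(n) - H                      # residual-maker matrix
e = M @ y                              # residuals y - X @ beta_hat

assert np.allclose(H @ H, H) and np.allclose(M @ M, M)  # both idempotent
assert np.allclose(e @ e, y @ M @ y)   # e'e = y'My
```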

An idempotent linear operator $P$ is a projection operator onto its range space $R(P)$ along its null space $N(P)$. $P$ is an orthogonal projection operator if and only if it is idempotent and symmetric.
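The hat matrix of the previous subsection is such an orthogonal projection (it is idempotent and symmetric); a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 3))
P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto the column space of X

assert np.allclose(P @ P, P)           # idempotent
assert np.allclose(P, P.T)             # symmetric: an orthogonal projection
x = rng.standard_normal(20)
assert np.allclose(P @ (x - P @ x), 0) # x - P x lies in the null space of P
```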

References

  1. ^ Chiang, Alpha C. (1984). Fundamental Methods of Mathematical Economics (3rd ed.). New York: McGraw–Hill. p. 80. ISBN 0070108137.
  2. ^ a b Greene, William H. (2003). Econometric Analysis (5th ed.). Upper Saddle River, NJ: Prentice–Hall. pp. 808–809. ISBN 0130661899.
  3. ^ Horn, Roger A.; Johnson, Charles R. (1990). Matrix Analysis. Cambridge University Press. p. 148. ISBN 0521386322.