Lehmann–Scheffé theorem

Theorem in statistics

In statistics, the Lehmann–Scheffé theorem ties together the ideas of completeness, sufficiency, uniqueness, and best unbiased estimation.[1] The theorem states that any estimator that is unbiased for a given unknown quantity and that depends on the data only through a complete, sufficient statistic is the unique best unbiased estimator of that quantity. The theorem is named after Erich Leo Lehmann and Henry Scheffé, in reference to their two early papers.[2][3]

If T is a complete sufficient statistic for θ and E[g(T)] = τ(θ) for every θ, then g(T) is the uniformly minimum-variance unbiased estimator (UMVUE) of τ(θ).
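The statement can be illustrated with a small exact computation. The sketch below is not taken from the source: it assumes a Bernoulli(p) sample of size n, for which T = X_1 + ... + X_n is a complete sufficient statistic (a standard textbook fact), and takes τ(p) = p². The crude unbiased estimator X_1·X_2 has conditional expectation g(T) = T(T−1)/(n(n−1)) given T, so g(T) should be unbiased for p² with smaller variance. The function names and the values n = 5, p = 3/10 are hypothetical choices for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p, t):
    """P(T = t) for T ~ Binomial(n, p); exact when p is a Fraction."""
    return comb(n, t) * p**t * (1 - p) ** (n - t)


def umvue_p_squared(n, t):
    """g(t) = t(t-1) / (n(n-1)), i.e. E[X1 * X2 | T = t] for a Bernoulli sample."""
    return Fraction(t * (t - 1), n * (n - 1))


def mean_and_variance(n, p):
    """Exact mean and variance of g(T), summing over the distribution of T."""
    mean = sum(binomial_pmf(n, p, t) * umvue_p_squared(n, t) for t in range(n + 1))
    second_moment = sum(
        binomial_pmf(n, p, t) * umvue_p_squared(n, t) ** 2 for t in range(n + 1)
    )
    return mean, second_moment - mean**2


# Illustrative (assumed) values, not from the source.
n, p = 5, Fraction(3, 10)
mean, var_umvue = mean_and_variance(n, p)

# X1 * X2 is Bernoulli(p^2), so its variance is p^2 (1 - p^2).
var_crude = p**2 * (1 - p**2)

print("E[g(T)]    =", mean, "(target p^2 =", p**2, ")")
print("Var[g(T)]  =", var_umvue)
print("Var[X1*X2] =", var_crude)
```

Exact rational arithmetic is used here instead of simulation so that unbiasedness can be checked as an equality rather than up to Monte Carlo error.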

Statement

Let $\vec{X} = X_1, X_2, \dots, X_n$ be a random sample from a distribution that has p.d.f. (or p.m.f. in the discrete case) $f(x:\theta)$, where $\theta \in \Omega$ is a parameter in the parameter space. Suppose $Y = u(\vec{X})$ is a sufficient statistic for θ, and let $\{f_Y(y:\theta) : \theta \in \Omega\}$ be a complete family. If $\varphi$ is a function such that $\operatorname{E}[\varphi(Y)] = \theta$ for all $\theta \in \Omega$, then $\varphi(Y)$ is the unique MVUE of θ.
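As a worked illustration of the statement (an added example, not part of the source), assume the sample is drawn from the uniform distribution on $(0,\theta)$ with $\Omega = (0,\infty)$. It is a standard textbook fact that $Y = X_{(n)} = \max_i X_i$ is then a complete sufficient statistic, with density $f_Y(y:\theta) = n y^{n-1}/\theta^n$ on $(0,\theta)$. Under these assumptions,

$$\operatorname{E}[Y] = \int_0^\theta y \cdot \frac{n y^{n-1}}{\theta^n}\,dy = \frac{n}{n+1}\,\theta, \qquad \text{so} \qquad \varphi(Y) = \frac{n+1}{n}\,Y \quad \text{satisfies} \quad \operatorname{E}[\varphi(Y)] = \theta.$$

By the theorem, $\frac{n+1}{n} X_{(n)}$ is therefore the unique MVUE of θ in this model.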

Proof

By the Rao–Blackwell theorem, if $Z$ is an unbiased estimator of θ, then $\varphi(Y) := \operatorname{E}[Z \mid Y]$ defines an unbiased estimator of θ whose variance is not greater than that of $Z$.

Now we show that this function is unique. Suppose $W$ is another candidate MVUE of θ. Then again $\psi(Y) := \operatorname{E}[W \mid Y]$ defines an unbiased estimator of θ whose variance is not greater than that of $W$. Then

$$\operatorname{E}[\varphi(Y) - \psi(Y)] = 0, \quad \theta \in \Omega.$$

Since $\{f_Y(y:\theta) : \theta \in \Omega\}$ is a complete family,

$$\operatorname{E}[\varphi(Y) - \psi(Y)] = 0 \implies \varphi(y) - \psi(y) = 0, \quad \theta \in \Omega$$

and therefore the function $\varphi$ is the unique function of Y with variance not greater than that of any other unbiased estimator. We conclude that $\varphi(Y)$ is the MVUE.

Example with a minimal sufficient statistic that is not complete

An example of an improvable Rao–Blackwell improvement, when using a minimal sufficient statistic that is not complete, was provided by Galili and Meilijson in 2016.[4] Let $X_1, \ldots, X_n$ be a random sample from a scale-uniform distribution $X \sim U((1-k)\theta, (1+k)\theta)$, with unknown mean $\operatorname{E}[X] = \theta$ and known design parameter $k \in (0,1)$. In the search for "best" possible unbiased estimators for $\theta$, it is natural to consider $X_1$ as an initial (crude) unbiased estimator for $\theta$ and then try to improve it. Since $X_1$ is not a function of $T = \left(X_{(1)}, X_{(n)}\right)$, the minimal sufficient statistic for $\theta$ (where $X_{(1)} = \min_i X_i$ and $X_{(n)} = \max_i X_i$), it may be improved using the Rao–Blackwell theorem as follows:

$$\hat{\theta}_{RB} = \operatorname{E}_\theta[X_1 \mid X_{(1)}, X_{(n)}] = \frac{X_{(1)} + X_{(n)}}{2}.$$
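The closed form of this conditional expectation can be recovered by a short symmetry argument, sketched here under the assumption $n \geq 2$. Conditionally on $\left(X_{(1)}, X_{(n)}\right)$, the observation $X_1$ is the minimum with probability $1/n$, the maximum with probability $1/n$, and otherwise is uniformly distributed between the two, so that

$$\operatorname{E}[X_1 \mid X_{(1)}, X_{(n)}] = \frac{1}{n} X_{(1)} + \frac{1}{n} X_{(n)} + \frac{n-2}{n} \cdot \frac{X_{(1)} + X_{(n)}}{2} = \frac{X_{(1)} + X_{(n)}}{2}.$$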

However, the following unbiased estimator can be shown to have lower variance:

$$\hat{\theta}_{LV} = \frac{1}{k^{2} \frac{n-1}{n+1} + 1} \cdot \frac{(1-k) X_{(1)} + (1+k) X_{(n)}}{2}.$$

In fact, it can be improved even further by using the following estimator:

$$\hat{\theta}_{\text{BAYES}} = \frac{n+1}{n} \left[ 1 - \frac{\frac{X_{(1)}(1+k)}{X_{(n)}(1-k)} - 1}{\left( \frac{X_{(1)}(1+k)}{X_{(n)}(1-k)} \right)^{n+1} - 1} \right] \frac{X_{(n)}}{1+k}$$

The model is a scale model. Optimal equivariant estimators can then be derived for loss functions that are invariant.[5]
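The comparison between the three estimators in this example can be explored numerically. The following Monte Carlo sketch is an added illustration, not code from the cited paper: it draws repeated samples from $U((1-k)\theta, (1+k)\theta)$ and reports the empirical mean and variance of the Rao–Blackwell (midrange) estimator, the lower-variance estimator, and the Bayes estimator, using the formulas given in this section. The function names and the values θ = 1, k = 0.9, n = 10, the number of replications, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics


def midrange(x_min, x_max):
    """Rao-Blackwell improvement of X1: the midrange of the sample."""
    return (x_min + x_max) / 2


def lower_variance(x_min, x_max, n, k):
    """The weighted combination of X(1) and X(n) given in the text."""
    scale = 1 / (k**2 * (n - 1) / (n + 1) + 1)
    return scale * ((1 - k) * x_min + (1 + k) * x_max) / 2


def generalized_bayes(x_min, x_max, n, k):
    """The Bayes estimator given in the text."""
    r = (x_min * (1 + k)) / (x_max * (1 - k))
    if r == 1.0:
        # Probability-zero edge case: (r - 1) / (r^(n+1) - 1) -> 1 / (n + 1).
        bracket = n / (n + 1)
    else:
        bracket = 1 - (r - 1) / (r ** (n + 1) - 1)
    return (n + 1) / n * bracket * x_max / (1 + k)


def simulate(theta=1.0, k=0.9, n=10, reps=20_000, seed=0):
    """Return {estimator name: (empirical mean, empirical variance)}."""
    rng = random.Random(seed)
    draws = {"midrange": [], "lower_variance": [], "generalized_bayes": []}
    for _ in range(reps):
        sample = [rng.uniform((1 - k) * theta, (1 + k) * theta) for _ in range(n)]
        lo, hi = min(sample), max(sample)
        draws["midrange"].append(midrange(lo, hi))
        draws["lower_variance"].append(lower_variance(lo, hi, n, k))
        draws["generalized_bayes"].append(generalized_bayes(lo, hi, n, k))
    return {
        name: (statistics.fmean(values), statistics.pvariance(values))
        for name, values in draws.items()
    }


results = simulate()
for name, (mean, variance) in results.items():
    print(f"{name:18s} mean={mean:.4f} variance={variance:.6f}")
```

A relatively large k is used because the gap between the midrange and the other two estimators grows with k; for small k the three variances are close and many more replications would be needed to separate them.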

References

  1. Casella, George (2001). Statistical Inference. Duxbury Press. p. 369. ISBN 978-0-534-24312-8.
  2. Lehmann, E. L.; Scheffé, H. (1950). "Completeness, similar regions, and unbiased estimation. I". Sankhyā. 10 (4): 305–340. doi:10.1007/978-1-4614-1412-4_23. JSTOR 25048038. MR 0039201.
  3. Lehmann, E. L.; Scheffé, H. (1955). "Completeness, similar regions, and unbiased estimation. II". Sankhyā. 15 (3): 219–236. doi:10.1007/978-1-4614-1412-4_24. JSTOR 25048243. MR 0072410.
  4. Galili, Tal; Meilijson, Isaac (31 Mar 2016). "An Example of an Improvable Rao–Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator". The American Statistician. 70 (1): 108–113. doi:10.1080/00031305.2015.1100683. PMC 4960505. PMID 27499547.
  5. Taraldsen, Gunnar (2020). "Micha Mandel (2020), 'The Scaled Uniform Model Revisited,' The American Statistician, 74:1, 98–100: Comment". The American Statistician. 74 (3): 315. doi:10.1080/00031305.2020.1769727. S2CID 219493070.