V-statistic

V-statistics are a class of statistics named for Richard von Mises, who developed their asymptotic distribution theory in a fundamental paper in 1947.[1] V-statistics are closely related to U-statistics[2][3] (U for "unbiased"), introduced by Wassily Hoeffding in 1948.[4] A V-statistic is a statistical function (of a sample) defined by a particular statistical functional of a probability distribution.

Statistical functions

Statistics that can be represented as functionals $T(F_n)$ of the empirical distribution function $F_n$ are called statistical functionals.[5] Differentiability of the functional $T$ plays a key role in the von Mises approach; thus von Mises considers differentiable statistical functionals.[1]

Examples of statistical functions

  1. The k-th central moment is the functional $T(F)=\int (x-\mu)^{k}\,dF(x)$, where $\mu = E[X]$ is the expected value of X. The associated statistical function is the sample k-th central moment,
    $$T_{n}=m_{k}=T(F_{n})=\frac{1}{n}\sum_{i=1}^{n}(x_{i}-\overline{x})^{k}.$$
  2. The chi-squared goodness-of-fit statistic is a statistical function $T(F_n)$, corresponding to the statistical functional
    $$T(F)=\sum_{i=1}^{k}\frac{\left(\int_{A_{i}}dF-p_{i}\right)^{2}}{p_{i}},$$
    where $A_i$ are the k cells and $p_i$ are the specified probabilities of the cells under the null hypothesis.
  3. The Cramér–von Mises and Anderson–Darling goodness-of-fit statistics are based on the functional
    $$T(F)=\int (F(x)-F_{0}(x))^{2}\,w(x;F_{0})\,dF_{0}(x),$$
    where $w(x;F_0)$ is a specified weight function and $F_0$ is a specified null distribution. If $w \equiv 1$ (constant weight) then $T(F_n)$ is the well-known Cramér–von Mises goodness-of-fit statistic; if $w(x;F_{0})=[F_{0}(x)(1-F_{0}(x))]^{-1}$ then $T(F_n)$ is the Anderson–Darling statistic.
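
The three functionals above can be evaluated at the empirical distribution by direct plug-in. The following is a minimal sketch in Python with NumPy, not a reference implementation: the function names, the use of histogram bins as cells, and the requirement that the null distribution be passed as a callable `cdf0` are all assumptions made for illustration. The Cramér–von Mises case uses constant weight and the standard closed form in terms of the ordered sample, which assumes a continuous null distribution.

```python
import numpy as np

def central_moment(x, k):
    """Plug-in value T(F_n) of the k-th central moment functional."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** k)

def chi_squared_functional(x, edges, p):
    """Plug-in value of sum_i (F_n(A_i) - p_i)^2 / p_i.

    The cells A_i are the bins defined by `edges` (NumPy treats every bin
    as half-open except the last, which includes its right edge).
    """
    x = np.asarray(x, dtype=float)
    counts, _ = np.histogram(x, bins=edges)
    freq = counts / x.size              # empirical cell probabilities
    p = np.asarray(p, dtype=float)
    return np.sum((freq - p) ** 2 / p)

def cramer_von_mises_functional(x, cdf0):
    """Plug-in value of int (F_n - F_0)^2 dF_0 with weight w = 1.

    `cdf0` is a hypothetical callable returning F_0 at an array of points.
    """
    u = np.sort(cdf0(np.asarray(x, dtype=float)))
    n = u.size
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n**2) + np.mean(((2 * i - 1) / (2 * n) - u) ** 2)
```

Multiplying `chi_squared_functional` by the sample size gives the familiar Pearson form based on observed and expected counts.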

Representation as a V-statistic

Suppose $x_1, \dots, x_n$ is a sample. In typical applications the statistical function has a representation as the V-statistic

$$V_{mn}=\frac{1}{n^{m}}\sum_{i_{1}=1}^{n}\cdots\sum_{i_{m}=1}^{n}h(x_{i_{1}},x_{i_{2}},\dots,x_{i_{m}}),$$

where h is a symmetric kernel function. Serfling[6] discusses how to find the kernel in practice. $V_{mn}$ is called a V-statistic of degree m.

A symmetric kernel of degree 2 is a function $h(x, y)$ such that $h(x, y) = h(y, x)$ for all x and y in the domain of h. For samples $x_1, \dots, x_n$, the corresponding V-statistic is defined as

$$V_{2,n}=\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n}h(x_{i},x_{j}).$$
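
The definition translates directly into code: average the kernel over all index tuples, with repeated indices allowed. The sketch below is a brute-force illustration in Python; the function name `v_statistic` and its argument order are assumptions, and the cost grows as n to the power m, so it is only practical for small samples and low degree.

```python
from itertools import product

def v_statistic(x, h, m):
    """Degree-m V-statistic: average of h over all n**m index tuples.

    Unlike a U-statistic, tuples with repeated indices are included.
    """
    x = list(x)
    n = len(x)
    total = sum(h(*args) for args in product(x, repeat=m))
    return total / n ** m
```

For example, `v_statistic(sample, lambda a, b: 0.5 * (a - b) ** 2, 2)` evaluates the degree-2 statistic for the kernel $h(x, y) = (x - y)^2/2$.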

Example of a V-statistic

  4. An example of a degree-2 V-statistic is the second central moment $m_2$. If $h(x, y) = (x - y)^2/2$, the corresponding V-statistic is
    $$V_{2,n}=\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{1}{2}(x_{i}-x_{j})^{2}=\frac{1}{n}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2},$$
    which is the maximum likelihood estimator of the variance under a normal model. With the same kernel, the corresponding U-statistic is the (unbiased) sample variance:
    $$s^{2}=\binom{n}{2}^{-1}\sum_{i<j}\frac{1}{2}(x_{i}-x_{j})^{2}=\frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}.$$
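
The two identities in this example can be checked numerically. The sketch below (Python with NumPy; the helper names are illustrative assumptions) evaluates the kernel on all ordered pairs for the V-statistic and on pairs with i < j for the U-statistic, so the results can be compared with `np.var(x)` and `np.var(x, ddof=1)`.

```python
import numpy as np

def variance_kernel(a, b):
    """Symmetric degree-2 kernel h(x, y) = (x - y)^2 / 2."""
    return 0.5 * (a - b) ** 2

def v_statistic_2(x, h):
    """Average of h over all n^2 ordered pairs, including i == j."""
    x = np.asarray(x, dtype=float)
    return h(x[:, None], x[None, :]).mean()

def u_statistic_2(x, h):
    """Average of h over the n-choose-2 pairs with i < j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)
    return h(x[i], x[j]).mean()

x = [1.0, 2.0, 4.0, 7.0]
print(v_statistic_2(x, variance_kernel), np.var(x))          # both 5.25
print(u_statistic_2(x, variance_kernel), np.var(x, ddof=1))  # both 7.0
```

The only difference between the two statistics is whether the diagonal terms, which are zero for this kernel, are counted in the average; this is why the V-statistic is smaller by the factor (n − 1)/n.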

Asymptotic distribution

In examples 1–3, the asymptotic distribution of the statistic is different: in (1) it is normal, in (2) it is chi-squared, and in (3) it is a weighted sum of chi-squared variables.

Von Mises' approach is a unifying theory that covers all of the cases above.[1] Informally, the type of asymptotic distribution of a statistical function depends on the order of "degeneracy," which is determined by which term is the first non-vanishing term in the Taylor expansion of the functional T. In case it is the linear term, the limit distribution is normal; otherwise higher order types of distributions arise (under suitable conditions such that a central limit theorem holds).

There is a hierarchy of cases parallel to the asymptotic theory of U-statistics.[7] Let A(m) be the property defined by:

A(m):
  1. $\operatorname{Var}(h(X_1, \dots, X_k)) = 0$ for k < m, and $\operatorname{Var}(h(X_1, \dots, X_k)) > 0$ for k = m;
  2. $n^{m/2}R_{mn}$ tends to zero (in probability). ($R_{mn}$ is the remainder term in the Taylor series for T.)

Case m = 1 (Non-degenerate kernel):

If A(1) is true, the statistic is a sample mean and the Central Limit Theorem implies that $T(F_n)$ is asymptotically normal.

In the variance example (4), $m_2$ is asymptotically normal with mean $\sigma^2$ and variance $(\mu_4 - \sigma^4)/n$, where $\mu_4 = E(X - E(X))^4$.
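
A small simulation can illustrate the stated mean and variance. The sketch below assumes standard normal data, for which $\sigma^2 = 1$ and $\mu_4 = 3$, so the asymptotic variance is $2/n$; the sample size, number of replications, seed, and function name are arbitrary choices made for illustration.

```python
import numpy as np

def simulate_m2(n, reps, seed=0):
    """Draw `reps` standard normal samples of size n; return m_2 for each."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((reps, n))
    return x.var(axis=1)            # ddof=0, i.e. the V-statistic m_2

n = 200
m2 = simulate_m2(n, reps=4000)
print("mean of m2:        ", m2.mean())     # compare with sigma^2 = 1
print("n * variance of m2:", n * m2.var())  # compare with mu_4 - sigma^4 = 2
```

The simulated values should be close to, but not exactly equal to, the asymptotic ones, since $m_2$ has a small finite-sample bias of order 1/n.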

Case m = 2 (Degenerate kernel):

Suppose A(2) is true, and $E[h^{2}(X_{1},X_{2})]<\infty$, $E|h(X_{1},X_{1})|<\infty$, and $E[h(x,X_{1})]\equiv 0$. Then $nV_{2,n}$ converges in distribution to a weighted sum of independent chi-squared variables:

$$nV_{2,n}\;\xrightarrow{d}\;\sum_{k=1}^{\infty}\lambda_{k}Z_{k}^{2},$$

where $Z_k$ are independent standard normal variables and $\lambda_k$ are constants that depend on the distribution F and the functional T. In this case the asymptotic distribution is called a quadratic form of centered Gaussian random variables. The statistic $V_{2,n}$ is called a degenerate kernel V-statistic. The V-statistic associated with the Cramér–von Mises functional[1] (Example 3) is an example of a degenerate kernel V-statistic.[8]
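
As a simple illustration of the degenerate case, consider the kernel $h(x, y) = xy$ with data of mean zero and unit variance. This kernel is chosen here for illustration and is not taken from the sources cited above. It satisfies $E[h(x, X_1)] = x\,E[X_1] = 0$, and $nV_{2,n} = n\bar{x}^2$, so the limit has a single nonzero weight $\lambda_1 = 1$ and is chi-squared with one degree of freedom. The sketch below simulates this under the assumption of standard normal data; sample size, replication count, and seed are arbitrary.

```python
import numpy as np

def n_times_v(x):
    """n * V_{2,n} for the kernel h(x, y) = x * y, via the full double sum."""
    x = np.asarray(x, dtype=float)
    return x.size * np.mean(np.outer(x, x))

def simulate_degenerate(n, reps, seed=0):
    """Simulate n * V_{2,n} over `reps` standard normal samples of size n."""
    rng = np.random.default_rng(seed)
    return np.array([n_times_v(rng.standard_normal(n)) for _ in range(reps)])

stats = simulate_degenerate(n=50, reps=5000)
print("mean:", stats.mean())                     # chi-squared(1) has mean 1
print("P(nV > 3.841):", np.mean(stats > 3.841))  # upper 5% point of chi-squared(1)
```

Note the scaling by n rather than by the square root of n: because the linear term vanishes, the statistic itself is of order 1/n.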

Notes

  1. von Mises (1947)
  2. Lee (1990)
  3. Koroljuk & Borovskich (1994)
  4. Hoeffding (1948)
  5. von Mises (1947), p. 309; Serfling (1980), p. 210.
  6. Serfling (1980, Section 6.5)
  7. Serfling (1980, Ch. 5–6); Lee (1990, Ch. 3)
  8. See Lee (1990, p. 160) for the kernel function.

References

  • Hoeffding, W. (1948). "A class of statistics with asymptotically normal distribution". Annals of Mathematical Statistics. 19 (3): 293–325. doi:10.1214/aoms/1177730196. JSTOR 2235637.
  • Koroljuk, V.S.; Borovskich, Yu.V. (1994). Theory of U-statistics (English translation by P.V.Malyshev and D.V.Malyshev from the 1989 Ukrainian ed.). Dordrecht: Kluwer Academic Publishers. ISBN 0-7923-2608-3.
  • Lee, A.J. (1990). U-Statistics: theory and practice. New York: Marcel Dekker, Inc. ISBN 0-8247-8253-4.
  • Neuhaus, G. (1977). "Functional limit theorems for U-statistics in the degenerate case". Journal of Multivariate Analysis. 7 (3): 424–439. doi:10.1016/0047-259X(77)90083-5.
  • Rosenblatt, M. (1952). "Limit theorems associated with variants of the von Mises statistic". Annals of Mathematical Statistics. 23 (4): 617–623. doi:10.1214/aoms/1177729341. JSTOR 2236587.
  • Serfling, R.J. (1980). Approximation theorems of mathematical statistics. New York: John Wiley & Sons. ISBN 0-471-02403-1.
  • Taylor, R.L.; Daffer, P.Z.; Patterson, R.F. (1985). Limit theorems for sums of exchangeable random variables. New Jersey: Rowman and Allanheld.
  • von Mises, R. (1947). "On the asymptotic distribution of differentiable statistical functions". Annals of Mathematical Statistics. 18 (2): 309–348. doi:10.1214/aoms/1177730385. JSTOR 2235734.