Cramér–von Mises criterion

Statistical test

In statistics, the Cramér–von Mises criterion is used to judge the goodness of fit of a cumulative distribution function $F^{*}$ against a given empirical distribution function $F_{n}$, or to compare two empirical distributions. It is also used as a part of other algorithms, such as minimum distance estimation. It is defined as

$$\omega^{2} = \int_{-\infty}^{\infty} \left[F_{n}(x) - F^{*}(x)\right]^{2} \, \mathrm{d}F^{*}(x)$$

In one-sample applications, $F^{*}$ is the theoretical distribution and $F_{n}$ is the empirically observed distribution. Alternatively, both distributions can be empirically estimated; this is called the two-sample case.

The criterion is named after Harald Cramér and Richard Edler von Mises, who first proposed it in 1928–1930.[1][2] The generalization to two samples is due to Anderson.[3]

The Cramér–von Mises test is an alternative to the Kolmogorov–Smirnov test (1933).[4]

Cramér–von Mises test (one sample)

Let $x_{1}, x_{2}, \ldots, x_{n}$ be the observed values, in increasing order. Then the statistic is[3]: 1153 [5]

$$T = n\omega^{2} = \frac{1}{12n} + \sum_{i=1}^{n} \left[\frac{2i-1}{2n} - F(x_{i})\right]^{2}.$$

If this value is larger than the tabulated value, then the hypothesis that the data came from the distribution $F$ can be rejected.
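The one-sample statistic can be evaluated directly from the sorted sample. The following is a minimal sketch in Python, not a reference implementation: the function names `cvm_one_sample` and `normal_cdf` are illustrative choices, and the hypothesized distribution is assumed to be supplied as a callable CDF. It computes only $T$; comparison against tabulated critical values is left to the reader.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution (an example choice of hypothesized F)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def cvm_one_sample(sample, cdf):
    """Return T = 1/(12n) + sum_i [(2i - 1)/(2n) - F(x_i)]^2.

    `sample` is any iterable of numbers; `cdf` is the hypothesized F.
    """
    xs = sorted(sample)  # the formula requires increasing order
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):  # i runs from 1 to n
        total += ((2 * i - 1) / (2.0 * n) - cdf(x)) ** 2
    return total


if __name__ == "__main__":
    data = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.7]
    print(cvm_one_sample(data, normal_cdf))
```

When each $F(x_{i})$ equals $(2i-1)/(2n)$ exactly, every squared term vanishes and $T$ attains its minimum value $1/(12n)$.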

Watson test

A modified version of the Cramér–von Mises test is the Watson test,[6] which uses the statistic $U^{2}$, where[5]

$$U^{2} = T - n\left(\bar{F} - \tfrac{1}{2}\right)^{2},$$

where

$$\bar{F} = \frac{1}{n} \sum_{i=1}^{n} F(x_{i}).$$
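As an illustration, Watson's $U^{2}$ can be obtained from the same quantities as $T$. The sketch below is an assumption-laden example rather than a definitive implementation: the name `watson_u2` is hypothetical, and the hypothesized distribution is again passed in as a callable CDF.

```python
def watson_u2(sample, cdf):
    """Return Watson's U^2 = T - n * (F_bar - 1/2)^2.

    T is the one-sample Cramér–von Mises statistic and F_bar is the
    mean of F(x_i) over the sample.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    f_values = [cdf(x) for x in xs]
    # One-sample Cramér–von Mises statistic T.
    t = 1.0 / (12.0 * n) + sum(
        ((2 * i - 1) / (2.0 * n) - f) ** 2
        for i, f in enumerate(f_values, start=1)
    )
    f_bar = sum(f_values) / n
    return t - n * (f_bar - 0.5) ** 2


if __name__ == "__main__":
    # Uniform(0, 1) hypothesis: the CDF is the identity on [0, 1].
    print(watson_u2([0.05, 0.2, 0.45, 0.5, 0.9], lambda u: u))
```

For a single observation ($n = 1$) the correction term cancels the squared term in $T$, so $U^{2} = 1/12$ whatever the value of $F(x_{1})$.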

Cramér–von Mises test (two samples)

Let $x_{1}, x_{2}, \ldots, x_{N}$ and $y_{1}, y_{2}, \ldots, y_{M}$ be the observed values in the first and second sample respectively, in increasing order. Let $r_{1}, r_{2}, \ldots, r_{N}$ be the ranks of the $x$s in the combined sample, and let $s_{1}, s_{2}, \ldots, s_{M}$ be the ranks of the $y$s in the combined sample. Anderson[3]: 1149 shows that

$$T = \frac{NM}{N+M}\,\omega^{2} = \frac{U}{NM(N+M)} - \frac{4MN-1}{6(M+N)}$$

where U is defined as

$$U = N \sum_{i=1}^{N} (r_{i} - i)^{2} + M \sum_{j=1}^{M} (s_{j} - j)^{2}$$

If the value of $T$ is larger than the tabulated values,[3]: 1154–1159 the hypothesis that the two samples come from the same distribution can be rejected. (Some books give critical values for $U$, which is more convenient, as it avoids the need to compute $T$ via the expression above. The conclusion will be the same.)
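The rank-based computation of $U$ and $T$ can be sketched as follows. This is a minimal illustration that assumes there are no tied values within or between the two samples; the function name `cvm_two_sample` is a hypothetical choice, and critical values are not included.

```python
def cvm_two_sample(xs, ys):
    """Return (U, T) for the two-sample Cramér–von Mises criterion.

    Assumes all values are distinct (no ties).
    """
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    # Label each value with its sample of origin, then sort the pool.
    combined = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    r, s = [], []  # ranks of the x's and of the y's in the combined sample
    for rank, (_, origin) in enumerate(combined, start=1):
        (r if origin == 0 else s).append(rank)
    # r and s are already increasing, so r[i-1] is the rank of x_i.
    u = (n * sum((ri - i) ** 2 for i, ri in enumerate(r, start=1))
         + m * sum((sj - j) ** 2 for j, sj in enumerate(s, start=1)))
    t = u / (n * m * (n + m)) - (4 * m * n - 1) / (6 * (m + n))
    return u, t


if __name__ == "__main__":
    print(cvm_two_sample([1, 3], [2, 4]))  # (12, 0.125)
```

In the usage example the combined order is $x, y, x, y$, so $r = (1, 3)$ and $s = (2, 4)$, giving $U = 2 \cdot 1 + 2 \cdot 5 = 12$ and $T = 12/16 - 15/24 = 0.125$.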

The above assumes there are no duplicates in the $x$, $y$, and $r$ sequences. So $x_{i}$ is unique, and its rank is $i$ in the sorted list $x_{1}, \ldots, x_{N}$. If there are duplicates, and $x_{i}$ through $x_{j}$ are a run of identical values in the sorted list, then one common approach is the midrank[7] method: assign each duplicate a "rank" of $(i+j)/2$. In the above equations, in the expressions $(r_{i} - i)^{2}$ and $(s_{j} - j)^{2}$, duplicates can modify all four variables $r_{i}$, $i$, $s_{j}$, and $j$.
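The midrank assignment described above can be sketched in a few lines. The helper name `midranks` is hypothetical; it returns the rank of each element of the sorted list, giving every member of a run of identical values at positions $i$ through $j$ the rank $(i+j)/2$.

```python
def midranks(values):
    """Return midranks for the sorted version of `values`.

    Positions are 1-based; a run of equal values occupying positions
    i..j all receive the rank (i + j) / 2.
    """
    xs = sorted(values)
    ranks = [0.0] * len(xs)
    start = 0  # 0-based index of the first element of the current run
    while start < len(xs):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(xs) and xs[end + 1] == xs[start]:
            end += 1
        mid = ((start + 1) + (end + 1)) / 2.0  # convert to 1-based positions
        for k in range(start, end + 1):
            ranks[k] = mid
        start = end + 1
    return ranks


if __name__ == "__main__":
    print(midranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Applying the same function to a single sample yields the replacement for the index $i$ (or $j$), and applying it to the combined sample yields the replacement for $r_{i}$ (or $s_{j}$).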

References

  1. Cramér, H. (1928). "On the Composition of Elementary Errors". Scandinavian Actuarial Journal. 1928 (1): 13–74. doi:10.1080/03461238.1928.10416862.
  2. von Mises, R. E. (1928). Wahrscheinlichkeit, Statistik und Wahrheit. Julius Springer.
  3. Anderson, T. W. (1962). "On the Distribution of the Two-Sample Cramer–von Mises Criterion" (PDF). Annals of Mathematical Statistics. 33 (3). Institute of Mathematical Statistics: 1148–1159. doi:10.1214/aoms/1177704477. ISSN 0003-4851. Retrieved June 12, 2009.
  4. Kolmogorov, A. N. (1933). "Sulla determinazione empirica di una legge di distribuzione". Giorn. Ist. Ital. Attuari. 4: 83–91.
  5. Pearson, E. S.; Hartley, H. O. (1972). Biometrika Tables for Statisticians, Volume 2. CUP. ISBN 0-521-06937-8. (page 118 and Table 54)
  6. Watson, G. S. (1961). "Goodness-Of-Fit Tests on a Circle". Biometrika. 48 (1/2): 109–114. JSTOR 2333135.
  7. Ruymgaart, F. H. (1980). "A unified approach to the asymptotic distribution theory of certain midrank statistics". In: Statistique non Parametrique Asymptotique, 1–18, J. P. Raoult (Ed.), Lecture Notes on Mathematics, No. 821, Springer, Berlin.
  • M. A. Stephens (1986). "Tests Based on EDF Statistics". In D'Agostino, R.B.; Stephens, M.A. (eds.). Goodness-of-Fit Techniques. New York: Marcel Dekker. ISBN 0-8247-7487-6.

Further reading

  • Xiao, Y.; A. Gordon; A. Yakovlev (January 2007). "A C++ Program for the Cramér–von Mises Two-Sample Test" (PDF). Journal of Statistical Software. 17 (8). doi:10.18637/jss.v017.i08. ISSN 1548-7660. OCLC 42456366. S2CID 54098783. Retrieved June 12, 2009.